WO2021140601A1 - Image processing device, endoscope system, and image processing method

Image processing device, endoscope system, and image processing method

Info

Publication number
WO2021140601A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
observation method
detection
threshold value
score
Prior art date
Application number
PCT/JP2020/000376
Other languages
English (en)
Japanese (ja)
Inventor
文行 白谷
Original Assignee
オリンパス株式会社
Priority date
Filing date
Publication date
Application filed by オリンパス株式会社
Priority to PCT/JP2020/000376
Publication of WO2021140601A1

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 1/00: Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; Illuminating arrangements therefor
    • A61B 1/04: Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; Illuminating arrangements therefor, combined with photographic or television appliances
    • A61B 1/045: Control thereof

Definitions

  • the present invention relates to an image processing system, an endoscope system, an image processing method, and the like.
  • In an object detection system, when the object detection score indicating object-likeness exceeds a preset threshold value, a candidate frame for the detected object is displayed on the screen. After the object detection system has been trained, this threshold is adjusted using an evaluation image set and fixed at a certain value. Adjustment of this threshold is required to balance the trade-off between correct detection and over-detection.
  • Depending on the situation, sensitivity may be prioritized or suppression of over-detection may be prioritized.
  • When sensitivity is prioritized, priority is given to reducing oversight, and an increase in over-detection is tolerated.
  • When suppression of over-detection is prioritized, a decrease in sensitivity is tolerated.
  • Patent Document 1 discloses a method for designating either an overdetection suppression mode or an undetected defect detection mode.
  • When the over-detection suppression mode is specified, defect candidate portions excluding pseudo-defects that do not need to be detected are extracted from the defect candidate portions, and when the undetected defect detection mode is specified, true defect portions are extracted from the defect candidate portions.
  • In the over-detection suppression mode, a parameter corresponding to the feature amount of the pseudo-defect portion is used, and in the undetected defect detection mode, a parameter corresponding to the feature amount of the true defect portion is used.
  • This parameter is specifically a threshold value.
  • the image to be processed may be captured by various observation methods.
  • If the threshold value for the detection score is fixed at a single value, the sensitivity and over-detection rate will vary depending on the observation method, and over-detection may not be sufficiently suppressed for some observation methods.
  • Patent Document 1 does not disclose a method of switching parameters according to an observation method.
  • The present disclosure provides an image processing system, an endoscope system, an image processing method, and the like that can output a detection result appropriate to the situation even when the observation method changes.
  • One aspect of the present disclosure relates to an image processing system including an image acquisition unit that acquires a processing target image and a processing unit that performs processing on the processing target image, in which the processing unit obtains a first classification score representing the certainty that the processing target image is captured by a first observation method and a second classification score representing the certainty that the processing target image is captured by a second observation method, detects a region of interest in the processing target image, obtains a detection score representing the certainty of the detected region of interest, sets a threshold value based on the first classification score and the second classification score, compares the set threshold value with the detection score, and outputs a detection result of the region of interest when the detection score is larger than the threshold value.
  • Another aspect of the present disclosure relates to an endoscope system including an imaging unit that captures an in-vivo image, an image acquisition unit that acquires the in-vivo image as a processing target image, and a processing unit that performs processing on the processing target image, in which the processing unit obtains a first classification score representing the certainty that the processing target image is captured by the first observation method and a second classification score representing the certainty that the processing target image is captured by the second observation method, detects a region of interest in the processing target image, obtains a detection score representing the certainty of the detected region of interest, sets a threshold value based on the first classification score and the second classification score, compares the set threshold value with the detection score, and outputs a detection result of the region of interest when the detection score is larger than the threshold value.
  • Yet another aspect of the present disclosure relates to an image processing method in which a processing target image is acquired, a first classification score representing the certainty that the processing target image is captured by the first observation method and a second classification score representing the certainty that the processing target image is captured by the second observation method are obtained, a region of interest is detected in the processing target image, a detection score representing the certainty of the detected region of interest is obtained, a threshold value is set based on the first classification score and the second classification score, the set threshold value is compared with the detection score, and a detection result of the region of interest is output when the detection score is larger than the threshold value.
  • FIG. 6A is a diagram for explaining the input and output of the region of interest detector
  • FIG. 6B is a diagram for explaining the input and output of the observation method classifier.
  • A configuration example of the learning device according to the first embodiment.
  • A configuration example of the image processing system according to the first embodiment.
  • A flowchart explaining the detection process in the first embodiment.
  • A configuration example of a neural network serving as a detection-integrated observation method classifier.
  • Observation methods include normal light observation, in which imaging is performed while irradiating normal light as illumination light; special light observation, in which imaging is performed while irradiating special light as illumination light; and dye spray observation, in which imaging is performed while a dye is sprayed onto the subject.
  • the image captured in normal light observation is referred to as a normal light image
  • the image captured in special light observation is referred to as a special light image
  • the image captured in dye spray observation is referred to as a dye spray image.
  • Normal light is light having intensity in a wide wavelength band among the wavelength bands corresponding to visible light, and is white light in a narrow sense.
  • The special light is light having spectral characteristics different from those of normal light, and is, for example, narrow band light having a narrower wavelength band than normal light.
  • NBI (Narrow Band Imaging) is one example of special light observation using such narrow band light.
  • the special light may include light in a wavelength band other than visible light such as infrared light.
  • Lights of various wavelength bands are known as special lights used for special light observation, and they can be widely applied in the present embodiment.
  • The dye in dye spray observation is, for example, indigo carmine. By spraying indigo carmine, it is possible to improve the visibility of polyps.
  • Various combinations of dye types and target regions of interest are also known, and they can be widely applied in the dye spray observation of the present embodiment.
  • the detection score is an index value indicating the certainty of the detection result.
  • the image to be processed is an in-vivo image and the detection target is a region of interest.
  • the detection target is a region of interest.
  • The region of interest in the present embodiment is a region in which the priority of observation for the user is relatively higher than that of other regions. If the user is a doctor performing diagnosis or treatment, the region of interest corresponds to, for example, the region where a lesion is imaged.
  • The region of interest may also be a region that captures a foam portion or a stool portion. That is, the object to which the user should pay attention differs depending on the purpose of observation, but in any case, the region in which the priority of observation for the user is relatively higher than that of other regions is the region of interest.
  • the region of interest is a lesion or a polyp.
  • During endoscopy, the observation method for imaging the subject changes, for example when the doctor switches the illumination light between normal light and special light or sprays dye on the body tissue. Due to this change in observation method, the detection results vary. For example, even when a detection result that appropriately captures a region of interest is obtained, the detection score associated with that result tends to be large when a normal light image is targeted and tends to be small when a special light image is targeted.
  • Here, the sensitivity is information indicating the ratio of regions of interest that are appropriately detected among the regions of interest captured in the input image.
  • Such a detection mode can be realized by adjusting the threshold value so that the sensitivity when the evaluation image is input becomes x%.
  • the tendency of the detection score may differ depending on the observation method.
  • For a normal light image, the sensitivity can be set to about x% even if the threshold value is relatively high.
  • For a special light image, the sensitivity cannot be set to about x% unless the threshold value is relatively low.
  • When the threshold value is adjusted using normal light images as evaluation images, the desired sensitivity cannot be obtained when a special light image is input, which is inappropriate for a sensitivity priority mode. On the other hand, when the threshold value is adjusted using special light images, the sensitivity becomes excessively high when a normal light image is input, so the over-detection rate may increase for normal light images.
  • Over-detection means erroneously detecting a region that is not a region of interest as a region of interest. In the following, information indicating the number of over-detections per image (locations / image) is referred to as the over-detection rate. For example, when the threshold is adjusted using normal light images as evaluation images in order to realize an over-detection suppression mode that keeps the over-detection rate near y, the over-detection rate may deviate from y when a special light image is input. Conversely, when the threshold is adjusted using special light images as evaluation images, the over-detection rate may deviate from y when a normal light image is input.
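  • As a rough illustration of these two metrics, the following minimal Python sketch computes sensitivity and over-detection rate from per-image counts; how a detection is matched to a ground-truth region (e.g. by an IoU criterion) is an assumption left to the caller and is not specified here.
```python
def evaluate(per_image_results):
    """Compute sensitivity and over-detection rate.

    per_image_results: one (n_ground_truth, n_true_positive, n_false_positive)
    tuple per evaluation image.
    """
    total_gt = sum(gt for gt, _, _ in per_image_results)
    total_tp = sum(tp for _, tp, _ in per_image_results)
    total_fp = sum(fp for _, _, fp in per_image_results)
    sensitivity = total_tp / total_gt if total_gt else 0.0
    over_detection_rate = total_fp / len(per_image_results)  # locations per image
    return sensitivity, over_detection_rate

# Example: 3 images -> sensitivity 4/5 = 0.8, over-detection rate 2/3 ≈ 0.67
print(evaluate([(2, 2, 1), (1, 1, 0), (2, 1, 1)]))
```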
  • the sensitivity and overdetection rate vary due to changes in the observation method.
  • Conventional methods such as Patent Document 1 do not take into account changes in the observation method.
  • With the conventional method, for example, when the sensitivity priority mode is used, consistent detection processing cannot be performed, and the sensitivity differs depending on the observation method.
  • Likewise, in the over-detection suppression mode, consistent detection processing cannot be performed, and the over-detection rate differs depending on the observation method.
  • the consistent detection process means that, for example, a numerical value representing sensitivity or a numerical value representing an overdetection rate is in the vicinity of a reference value regardless of the observation method.
  • the desired sensitivity may not be obtained even in the sensitivity priority mode, or the overdetection may not be sufficiently suppressed even in the overdetection suppression mode.
  • In the method of the present embodiment, the threshold value is set based on the first classification score and the second classification score obtained for the image to be processed. Then, the set threshold value and the detection score are compared, and when the detection score is larger than the threshold value, the detection result of the region of interest is output.
  • Since the threshold value is dynamically adjusted according to the determination result of the observation method of the image to be processed, consistent detection processing can be realized even for images captured by different observation methods.
  • For example, a detection process in which detection sensitivity is consistently prioritized regardless of the observation method, or a detection process in which suppression of over-detection is consistently prioritized, can be executed. As a result, it becomes possible to provide a system capable of stable diagnostic support even when various observation methods are assumed.
  • FIG. 1 is a configuration example of a system including the image processing system 200.
  • the system includes a learning device 100, an image processing system 200, and an endoscope system 300.
  • the system is not limited to the configuration shown in FIG. 1, and various modifications such as omitting some of these components or adding other components can be performed.
  • the learning device 100 generates a trained model by performing machine learning.
  • the endoscope system 300 captures an in-vivo image with an endoscope imaging device.
  • the image processing system 200 acquires an in-vivo image as a processing target image. Then, the image processing system 200 operates according to the trained model generated by the learning device 100 to perform detection processing of the region of interest for the image to be processed.
  • the endoscope system 300 acquires and displays the detection result. In this way, by using machine learning, it becomes possible to realize a system that supports diagnosis by a doctor or the like.
  • the learning device 100, the image processing system 200, and the endoscope system 300 may be provided as separate bodies, for example.
  • the learning device 100 and the image processing system 200 are information processing devices such as a PC (Personal Computer) and a server system, respectively.
  • the learning device 100 may be realized by distributed processing by a plurality of devices.
  • the learning device 100 may be realized by cloud computing using a plurality of servers.
  • the image processing system 200 may be realized by cloud computing or the like.
  • the endoscope system 300 is a device including an insertion unit 310, a system control device 330, and a display unit 340, for example, as will be described later with reference to FIG. 4.
  • a part or all of the system control device 330 may be realized by a device such as a server system via a network.
  • a part or all of the system control device 330 is realized by cloud computing.
  • one of the image processing system 200 and the learning device 100 may include the other.
  • the image processing system 200 (learning device 100) is a system that executes both a process of generating a learned model by performing machine learning and a detection process according to the learned model.
  • one of the image processing system 200 and the endoscope system 300 may include the other.
  • the system control device 330 of the endoscope system 300 includes an image processing system 200.
  • the system control device 330 executes both the control of each part of the endoscope system 300 and the detection process according to the trained model.
  • a system including all of the learning device 100, the image processing system 200, and the system control device 330 may be realized.
  • In this case, a server system composed of one or a plurality of servers may execute the generation of a trained model by machine learning, the detection process according to the trained model, and the control of each part of the endoscope system 300.
  • the specific configuration of the system shown in FIG. 1 can be modified in various ways.
  • FIG. 2 is a configuration example of the learning device 100.
  • the learning device 100 includes an image acquisition unit 110 and a learning unit 120.
  • the image acquisition unit 110 acquires a learning image.
  • the image acquisition unit 110 is, for example, a communication interface for acquiring a learning image from another device.
  • the learning image is an image in which correct answer data is added as metadata to, for example, a normal light image, a special light image, a dye spray image, or the like.
  • the learning unit 120 generates a trained model by performing machine learning based on the acquired learning image. The details of the data used for machine learning and the specific flow of the learning process will be described later.
  • the learning unit 120 is composed of the following hardware.
  • the hardware can include at least one of a circuit that processes a digital signal and a circuit that processes an analog signal.
  • hardware can consist of one or more circuit devices mounted on a circuit board or one or more circuit elements.
  • One or more circuit devices are, for example, ICs (Integrated Circuits), FPGAs (field-programmable gate arrays), and the like.
  • One or more circuit elements are, for example, resistors, capacitors, and the like.
  • the learning unit 120 may be realized by the following processor.
  • the learning device 100 includes a memory that stores information and a processor that operates based on the information stored in the memory.
  • the information is, for example, a program and various data.
  • the processor includes hardware.
  • various processors such as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and a DSP (Digital Signal Processor) can be used.
  • the memory may be a semiconductor memory such as SRAM (Static Random Access Memory) or DRAM (Dynamic Random Access Memory), a register, or a magnetic storage device such as an HDD (Hard Disk Drive). It may be an optical storage device such as an optical disk device.
  • the memory stores instructions that can be read by a computer, and when the instructions are executed by the processor, the functions of each part of the learning unit 120 are realized as processing.
  • Each part of the learning unit 120 is, for example, each part described later with reference to FIGS. 7 and 14.
  • the instruction here may be an instruction of an instruction set constituting a program, or an instruction instructing an operation to a hardware circuit of a processor.
  • FIG. 3 is a configuration example of the image processing system 200.
  • the image processing system 200 includes an image acquisition unit 210, a processing unit 220, and a storage unit 230.
  • the image acquisition unit 210 acquires an in-vivo image captured by the imaging device of the endoscope system 300 as a processing target image.
  • the image acquisition unit 210 is realized as a communication interface for receiving an in-vivo image from the endoscope system 300 via a network.
  • the network here may be a private network such as an intranet or a public communication network such as the Internet.
  • the network may be wired or wireless.
  • the processing unit 220 performs detection processing of the region of interest in the image to be processed by operating according to the trained model. Further, the processing unit 220 determines the information to be output based on the detection result of the trained model.
  • the processing unit 220 is composed of hardware including at least one of a circuit for processing a digital signal and a circuit for processing an analog signal.
  • hardware can consist of one or more circuit devices mounted on a circuit board or one or more circuit elements.
  • the processing unit 220 may be realized by the following processor.
  • the image processing system 200 includes a memory that stores information such as a program and various data, and a processor that operates based on the information stored in the memory.
  • the memory here may be the storage unit 230 or may be a different memory.
  • various processors such as GPU can be used.
  • the memory can be realized by various aspects such as a semiconductor memory, a register, a magnetic storage device, and an optical storage device.
  • the memory stores instructions that can be read by a computer, and when the instructions are executed by the processor, the functions of each part of the processing unit 220 are realized as processing.
  • Each part of the processing unit 220 is, for example, each part described later with reference to FIGS. 8 and 11.
  • the storage unit 230 serves as a work area for the processing unit 220 and the like, and its function can be realized by a semiconductor memory, a register, a magnetic storage device, or the like.
  • the storage unit 230 stores the image to be processed acquired by the image acquisition unit 210. Further, the storage unit 230 stores the information of the trained model generated by the learning device 100.
  • FIG. 4 is a configuration example of the endoscope system 300.
  • the endoscope system 300 includes an insertion unit 310, an external I / F unit 320, a system control device 330, a display unit 340, and a light source device 350.
  • the insertion portion 310 is a portion whose tip side is inserted into the body.
  • the insertion unit 310 includes an objective optical system 311, an image sensor 312, an actuator 313, an illumination lens 314, a light guide 315, and an AF (Auto Focus) start / end button 316.
  • the light guide 315 guides the illumination light from the light source 352 to the tip of the insertion portion 310.
  • the illumination lens 314 irradiates the subject with the illumination light guided by the light guide 315.
  • the objective optical system 311 forms an image of the reflected light reflected from the subject as a subject image.
  • the objective optical system 311 includes a focus lens, and the position where the subject image is formed can be changed according to the position of the focus lens.
  • the actuator 313 drives the focus lens based on the instruction from the AF control unit 336.
  • AF is not indispensable, and the endoscope system 300 may be configured not to include the AF control unit 336.
  • the image sensor 312 receives light from the subject that has passed through the objective optical system 311.
  • the image pickup device 312 may be a monochrome sensor or an element provided with a color filter.
  • the color filter may be a widely known Bayer filter, a complementary color filter, or another filter.
  • Complementary color filters are filters that include cyan, magenta, and yellow color filters.
  • the AF start / end button 316 is an operation interface for the user to operate the AF start / end.
  • the external I / F unit 320 is an interface for inputting from the user to the endoscope system 300.
  • the external I / F unit 320 includes, for example, an AF control mode setting button, an AF area setting button, an image processing parameter adjustment button, and the like.
  • the system control device 330 performs image processing and control of the entire system.
  • the system control device 330 includes an A / D conversion unit 331, a pre-processing unit 332, a detection processing unit 333, a post-processing unit 334, a system control unit 335, an AF control unit 336, and a storage unit 337.
  • the A / D conversion unit 331 converts the analog signals sequentially output from the image sensor 312 into digital images, and sequentially outputs the digital images to the preprocessing unit 332.
  • the pre-processing unit 332 performs various correction processes on the in-vivo images sequentially output from the A / D conversion unit 331, and sequentially outputs them to the detection processing unit 333 and the AF control unit 336.
  • the correction process includes, for example, a white balance process, a noise reduction process, and the like.
  • the detection processing unit 333 performs a process of transmitting, for example, an image after correction processing acquired from the preprocessing unit 332 to an image processing system 200 provided outside the endoscope system 300.
  • the endoscope system 300 includes a communication unit (not shown), and the detection processing unit 333 controls the communication of the communication unit.
  • the communication unit here is a communication interface for transmitting an in-vivo image to the image processing system 200 via a given network.
  • the detection processing unit 333 performs a process of receiving the detection result from the image processing system 200 by controlling the communication of the communication unit.
  • the system control device 330 may include an image processing system 200.
  • the A / D conversion unit 331 corresponds to the image acquisition unit 210.
  • the storage unit 337 corresponds to the storage unit 230.
  • the pre-processing unit 332, the detection processing unit 333, the post-processing unit 334, and the like correspond to the processing unit 220.
  • the detection processing unit 333 operates according to the information of the learned model stored in the storage unit 337 to perform the detection processing of the region of interest for the in-vivo image which is the processing target image.
  • the trained model is a neural network
  • the detection processing unit 333 performs forward arithmetic processing on the input processing target image using the weight determined by learning. Then, the detection result is output based on the output of the output layer.
  • the post-processing unit 334 performs post-processing based on the detection result in the detection processing unit 333, and outputs the image after the post-processing to the display unit 340.
  • various processes such as emphasizing the recognition target in the image and adding information representing the detection result can be considered.
  • the post-processing unit 334 performs post-processing to generate a display image by superimposing the detection frame detected by the detection processing unit 333 on the image output from the pre-processing unit 332.
  • the system control unit 335 is connected to the image sensor 312, the AF start / end button 316, the external I / F unit 320, and the AF control unit 336, and controls each unit. Specifically, the system control unit 335 inputs and outputs various control signals.
  • the AF control unit 336 performs AF control using images sequentially output from the preprocessing unit 332.
  • the display unit 340 sequentially displays the images output from the post-processing unit 334.
  • the display unit 340 is, for example, a liquid crystal display, an EL (Electro-Luminescence) display, or the like.
  • the light source device 350 includes a light source 352 that emits illumination light.
  • the light source 352 may be a xenon light source, an LED, or a laser light source. Further, the light source 352 may be another light source, and the light emitting method is not limited.
  • the light source device 350 can irradiate normal light and special light.
  • the light source device 350 includes a white light source and a rotation filter, and can switch between normal light and special light based on the rotation of the rotation filter.
  • Alternatively, the light source device 350 may have a configuration capable of emitting a plurality of lights having different wavelength bands by including a plurality of light sources such as a red LED, a green LED, a blue LED, a green narrow band light LED, and a blue narrow band light LED.
  • the light source device 350 irradiates normal light by lighting a red LED, a green LED, and a blue LED, and irradiates special light by lighting a green narrow band light LED and a blue narrow band light LED.
  • various configurations of a light source device that irradiates normal light and special light are known, and they can be widely applied in the present embodiment.
  • In the following, an example in which the first observation method is normal light observation and the second observation method is special light observation will be described.
  • the second observation method may be dye spray observation. That is, in the following description, the notation of special light observation or special light image can be appropriately read as dye spray observation and dye spray image.
  • machine learning using a neural network will be described. That is, the region of interest detector and the observation method classifier described below are, for example, trained models using a neural network.
  • the method of the present embodiment is not limited to this.
  • For example, machine learning using another model such as an SVM (support vector machine) may be performed, or machine learning using a method developed from various methods such as a neural network or an SVM may be performed.
  • FIG. 5A is a schematic diagram illustrating a neural network.
  • the neural network has an input layer into which data is input, an intermediate layer in which operations are performed based on the output from the input layer, and an output layer in which data is output based on the output from the intermediate layer.
  • a network in which the intermediate layer is two layers is illustrated, but the intermediate layer may be one layer or three or more layers.
  • the number of nodes (neurons) included in each layer is not limited to the example of FIG. 5 (A), and various modifications can be performed. Considering the accuracy, it is desirable to use deep learning using a multi-layer neural network for the learning of this embodiment.
  • the term "multilayer” here means four or more layers in a narrow sense.
  • the nodes included in a given layer are connected to the nodes in the adjacent layer.
  • a weighting coefficient is set for each bond.
  • Each node multiplies the output of the node in the previous stage by the weighting coefficient to obtain the total value of the multiplication results.
  • each node adds a bias to the total value and obtains the output of the node by applying an activation function to the addition result.
  • By sequentially executing this process from the input layer to the output layer, the output of the neural network is obtained.
  • Various functions such as a sigmoid function and a ReLU function are known as activation functions, and these can be widely applied in the present embodiment.
  • the weighting coefficient here includes a bias.
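  • For illustration, the calculation at a single node described above (weighted sum of the previous layer's outputs, plus a bias, passed through an activation function) could be sketched in Python as follows; this is a minimal example, not the configuration of the present embodiment.
```python
import math

def node_output(prev_outputs, weights, bias):
    """Output of one node: weighted sum of the previous layer's outputs,
    plus a bias, passed through an activation function (sigmoid here)."""
    total = sum(w * x for w, x in zip(weights, prev_outputs))
    return 1.0 / (1.0 + math.exp(-(total + bias)))

# Example: a node receiving three outputs from the previous layer.
print(node_output([0.2, 0.7, 0.1], weights=[0.5, -0.3, 0.8], bias=0.1))
```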
  • the learning device 100 inputs the input data of the training data to the neural network, and obtains the output by performing a forward calculation using the weighting coefficient at that time.
  • the learning unit 120 of the learning device 100 calculates an error function based on the output and the correct answer data of the training data. Then, the weighting coefficient is updated so as to reduce the error function.
  • an error backpropagation method in which the weighting coefficient is updated from the output layer to the input layer can be used.
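  • The training loop described above (forward calculation, error function, weight update by backpropagation) could be sketched for a tiny two-layer network as follows; the layer sizes, activation function, and mean-squared-error loss are assumptions for illustration and do not represent the actual detector or classifier.
```python
import numpy as np

rng = np.random.default_rng(0)
# Tiny network: 3 inputs -> 4 hidden nodes (sigmoid) -> 1 output (sigmoid).
W1, b1 = rng.normal(size=(4, 3)) * 0.1, np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)) * 0.1, np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, t, lr=0.1):
    """One update on a single (input, correct answer) pair."""
    global W1, b1, W2, b2
    # Forward calculation using the current weighting coefficients.
    h = sigmoid(W1 @ x + b1)
    y = sigmoid(W2 @ h + b2)
    loss = 0.5 * np.sum((y - t) ** 2)      # error function
    # Backpropagation: update weights from the output layer toward the input layer.
    delta_out = (y - t) * y * (1 - y)
    delta_hid = (W2.T @ delta_out) * h * (1 - h)
    W2 -= lr * np.outer(delta_out, h); b2 -= lr * delta_out
    W1 -= lr * np.outer(delta_hid, x); b1 -= lr * delta_hid
    return loss

print(train_step(np.array([0.2, 0.7, 0.1]), np.array([1.0])))
```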
  • FIG. 5B is a schematic diagram illustrating CNN.
  • The CNN includes a convolution layer that performs a convolution operation and a pooling layer.
  • The convolution layer is a layer that performs filter processing.
  • the pooling layer is a layer that performs a pooling operation that reduces the size in the vertical direction and the horizontal direction.
  • the example shown in FIG. 5B is a network in which the output is obtained by performing the calculation by the convolution layer and the pooling layer a plurality of times and then performing the calculation by the fully connected layer.
  • The fully connected layer is a layer in which all the nodes of the previous layer are connected to the nodes of a given layer, and its arithmetic corresponds to the arithmetic of each layer described above with reference to FIG. 5 (A). Although the description is omitted in FIG. 5 (B), arithmetic processing by an activation function is also performed in the CNN.
  • Various configurations of CNNs are known, and they can be widely applied in the present embodiment. For example, as the CNN of the present embodiment, a known RPN or the like (Region Proposal Network) can be used.
  • the processing procedure is the same as in FIG. 5 (A). That is, the learning device 100 inputs the input data of the training data to the CNN, and obtains an output by performing a filter process or a pooling operation using the filter characteristics at that time. An error function is calculated based on the output and the correct answer data, and the weighting coefficient including the filter characteristic is updated so as to reduce the error function.
  • the backpropagation method can be used.
  • the detection process of the region of interest executed by the image processing system 200 is specifically a process of detecting at least one of the presence / absence, position, size, and shape of the region of interest.
  • the detection process is a process of obtaining information for specifying a rectangular frame area surrounding a region of interest and a detection score indicating the certainty of the frame area.
  • the frame area is referred to as a detection frame.
  • The information that identifies the detection frame consists of, for example, four numerical values: the coordinate value on the horizontal axis of the upper-left end point of the detection frame, the coordinate value on the vertical axis of that end point, the length of the detection frame in the horizontal axis direction, and the length of the detection frame in the vertical axis direction. Since the aspect ratio of the detection frame changes as the shape of the region of interest changes, the detection frame corresponds to information representing the shape as well as the presence / absence, position, and size of the region of interest.
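  • A detection frame and its detection score could be represented, for example, by the following minimal data structure; the field names are illustrative and not part of the disclosure.
```python
from dataclasses import dataclass

@dataclass
class DetectionFrame:
    """Rectangular detection frame plus its detection score.

    x, y   : coordinates of the upper-left end point (horizontal / vertical axis)
    width  : length in the horizontal axis direction
    height : length in the vertical axis direction
    score  : detection score representing the certainty of this frame
    """
    x: float
    y: float
    width: float
    height: float
    score: float

frame = DetectionFrame(x=120, y=80, width=64, height=48, score=0.92)
```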
  • FIG. 7 is a configuration example of the learning device 100 according to the first embodiment.
  • the learning unit 120 of the learning device 100 includes a detection learning unit 121 and an observation method classification learning unit 122.
  • the detection learning unit 121 acquires the image group A1 from the image acquisition unit 110 and performs machine learning based on the image group A1 to generate a region of interest detector.
  • The learning process executed by the detection learning unit 121 is a learning process for generating a trained model applicable to both normal light images and special light images. That is, the image group A1 includes learning images in which detection data, which is information related to at least one of the presence / absence, position, size, and shape of a region of interest, is added to normal light images, and learning images in which detection data is added to special light images.
  • the detection data is mask data in which the polyp area to be detected and the background area are painted in different colors.
  • the detection data may be information for identifying a detection frame surrounding the polyp.
  • the detection frame is not limited to a rectangular frame, and may be an elliptical frame or the like as long as it surrounds the vicinity of the polyp region.
  • FIG. 6A is a diagram illustrating the input and output of the region of interest detector.
  • the region of interest detector receives the image to be processed as an input, performs processing on the image to be processed, and outputs information representing the detection result.
  • the detection learning unit 121 performs machine learning of a model including an input layer into which an image is input, an intermediate layer, and an output layer for outputting a detection result.
  • the region of interest detector is an object detection CNN such as an RPN (Region Proposal Network), Faster R-CNN, or YOLO (You Only Look Once).
  • the detection learning unit 121 uses the learning image included in the image group A1 as an input of the neural network, and performs a forward calculation based on the current weighting coefficient.
  • the detection learning unit 121 calculates the error between the output of the output layer and the detection data which is the correct answer data as an error function, and updates the weighting coefficient so as to reduce the error function.
  • the above is the process based on one learning image, and the detection learning unit 121 learns the weighting coefficient of the region of interest detector by repeating the above process.
  • the update of the weighting coefficient is not limited to being performed in units of one image, and batch learning or the like may be used.
  • The image group A2 is an image group including learning images in which observation method data, which is information for specifying the observation method, is added as correct answer data to normal light images, and learning images in which observation method data is added to special light images.
  • the observation method data is, for example, a label representing either a normal light image or a special light image.
  • FIG. 6B is a diagram illustrating the input and output of the observation method classifier.
  • the observation method classifier receives the processing target image as an input, performs processing on the processing target image, and outputs information representing the observation method classification result.
  • the information representing the observation method classification result is, for example, the first classification score and the second classification score.
  • the observation method classification learning unit 122 performs machine learning of a model including an input layer into which an image is input and an output layer in which the observation method classification result is output.
  • the observation method classifier is, for example, an image classification CNN such as VGG16 or ResNet.
  • the observation method classification learning unit 122 uses the learning image included in the image group A2 as an input of the neural network, and performs a forward calculation based on the current weighting coefficient.
  • The observation method classification learning unit 122 calculates the error between the output of the output layer and the observation method data, which is the correct answer data, as an error function, and updates the weighting coefficient so as to reduce the error function.
  • the observation method classification learning unit 122 learns the weighting coefficient of the observation method classifier by repeating the above processing.
  • The output of the output layer of the observation method classifier includes, for example, data representing the certainty that the input image is a normal light image captured by normal light observation, and data representing the certainty that the input image is a special light image captured by special light observation.
  • For example, the output layer of the observation method classifier is a known softmax layer, which outputs two pieces of probability data whose total is 1.
  • the data representing the certainty that the input image is a normal light image is referred to as a normal light score
  • the data representing the certainty that the input image is a special light image is referred to as a special light score.
  • the first classification score corresponds to the normal light score
  • the second classification score corresponds to the special light score.
  • When the label serving as the correct answer data is a normal light image, the observation method classification learning unit 122 obtains the error function by using, as the correct answer data, data in which the probability of being a normal light image is 1 and the probability of being a special light image is 0. When the label is a special light image, it obtains the error function by using, as the correct answer data, data in which the probability of being a normal light image is 0 and the probability of being a special light image is 1.
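  • The correct answer data and one possible error function could be sketched as follows; the use of cross-entropy is an assumption for illustration, as the disclosure does not fix a specific error function.
```python
import math

def correct_answer_data(label):
    """One-hot probability data: (P(normal light image), P(special light image))."""
    return (1.0, 0.0) if label == "normal" else (0.0, 1.0)

def cross_entropy(predicted, target, eps=1e-12):
    """Error between the softmax output and the correct answer data."""
    return -sum(t * math.log(p + eps) for p, t in zip(predicted, target))

# Classifier output (normal light score, special light score) vs. a "normal" label.
print(cross_entropy(predicted=(0.9, 0.1), target=correct_answer_data("normal")))
```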
  • FIG. 8 is a configuration example of the image processing system 200 according to the first embodiment.
  • the processing unit 220 of the image processing system 200 includes an observation method classification unit 221, a threshold value setting unit 222, a detection processing unit 223, and an output processing unit 224.
  • the observation method classification unit 221 performs an observation method classification process based on the observation method classifier.
  • the threshold value setting unit 222 sets the threshold value used for the output processing of the detection result based on the result of the observation method classification processing.
  • the detection processing unit 223 performs detection processing using the region of interest detector.
  • the output processing unit 224 performs output processing based on the threshold value set by the threshold value setting unit 222 and the detection result of the detection processing unit 223.
  • FIG. 9 is a flowchart illustrating the processing of the image processing system 200 in the first embodiment.
  • the processing flow is not limited to FIG. 9, and various modifications can be performed.
  • the detection process in step S103 may be performed after the threshold value setting process in steps S104 to S106, or the detection process and the threshold value setting process may be performed in parallel.
  • each step will be described.
  • In step S101, the image acquisition unit 210 acquires an in-vivo image captured by the endoscope imaging device as the processing target image.
  • The observation method classification unit 221 performs an observation method classification process for determining whether the image to be processed is a normal light image or a special light image. For example, the observation method classification unit 221 inputs the processing target image acquired by the image acquisition unit 210 into the observation method classifier, thereby acquiring a normal light score indicating the probability that the processing target image is a normal light image and a special light score indicating the probability that the processing target image is a special light image.
  • The detection processing unit 223 performs detection processing of the region of interest using the region of interest detector. Specifically, the detection processing unit 223 inputs the processing target image into the region of interest detector to acquire information on a predetermined number of detection frames in the processing target image and a detection score associated with each detection frame.
  • the detection result in the present embodiment represents, for example, a detection frame, and the detection score represents the certainty of the detection result.
  • The threshold value setting unit 222 sets the threshold value based on the observation method classification result. Specifically, first, in step S104, the threshold value setting unit 222 determines whether or not the observation method classification result represents normal light observation. For example, the threshold value setting unit 222 acquires the normal light score and the special light score from the observation method classification unit 221 and determines their magnitude relationship. The threshold value setting unit 222 determines that the observation method is normal light observation when the normal light score is equal to or higher than the special light score, and determines that the observation method is special light observation when the normal light score is smaller than the special light score.
  • the threshold value setting unit 222 sets the threshold value for normal light observation in step S105.
  • the threshold value setting unit 222 sets the threshold value for special light observation in step S106.
  • Specifically, the storage unit 230 of the image processing system 200 stores a threshold value Th1 acquired by using normal light images as evaluation images and a threshold value Th2 acquired by using special light images as evaluation images.
  • For example, when realizing an over-detection suppression mode in which the over-detection rate is close to 0.05 (locations / image), Th1 is a threshold value set so that the over-detection rate when normal light images are input as evaluation images is 0.05, and Th2 is a threshold value set so that the over-detection rate when special light images are input as evaluation images is 0.05.
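  • Such per-observation-method thresholds could be calibrated offline, for example by sweeping candidate thresholds over the evaluation images of each observation method until the over-detection rate reaches the target; the sketch below shows one possible way to do this, with the data layout chosen purely for illustration.
```python
def calibrate_threshold(eval_detections, target_rate=0.05, step=0.01):
    """Return the lowest threshold whose over-detection rate is <= target_rate.

    eval_detections: one list per evaluation image, each containing
    (detection_score, is_true_positive) pairs for that image's candidate frames.
    """
    n_images = len(eval_detections)
    threshold = 0.0
    while threshold <= 1.0:
        false_positives = sum(
            1
            for frames in eval_detections
            for score, is_tp in frames
            if score > threshold and not is_tp
        )
        if false_positives / n_images <= target_rate:
            return threshold
        threshold += step
    return 1.0

# Th1 from normal light evaluation images, Th2 from special light ones (data hypothetical):
# Th1 = calibrate_threshold(normal_light_eval, target_rate=0.05)
# Th2 = calibrate_threshold(special_light_eval, target_rate=0.05)
```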
  • the threshold value setting unit 222 performs a process of setting Th1 as a threshold value in step S105, and a process of setting Th2 as a threshold value in step S106.
  • In step S107, the output processing unit 224 executes the output processing of the detection result based on the detection result acquired in step S103 and the threshold value set in step S105 or S106. Specifically, the output processing unit 224 performs a process of comparing the detection score associated with each detection frame with the set threshold value. Then, the output processing unit 224 outputs the detection frames whose detection score is larger than the threshold value among the detection frames detected by the detection processing unit 223, and does not output the detection frames whose detection score is equal to or less than the threshold value.
  • the output process in step S107 is, for example, a process of generating a display image when the image processing system 200 is included in the endoscope system 300, and a process of displaying the display image on the display unit 340.
  • When the image processing system 200 is provided outside the endoscope system 300, the output process is, for example, a process of transmitting the display image to the endoscope system 300.
  • the output process may be a process of transmitting information representing the detection frame to the endoscope system 300.
  • the display image generation process and display control are executed in the endoscope system 300.
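  • Putting the above steps together, the per-image flow could be sketched as follows; the classifier and detector stand in for the trained models of FIG. 6A and FIG. 6B, and the function names and threshold values are illustrative assumptions.
```python
TH1 = 0.62  # threshold calibrated on normal light evaluation images (value illustrative)
TH2 = 0.48  # threshold calibrated on special light evaluation images (value illustrative)

def detect_regions_of_interest(image, observation_classifier, roi_detector):
    """Classify the observation method, set the threshold, and filter detections."""
    # Observation method classification (two scores summing to 1 via softmax).
    normal_score, special_score = observation_classifier(image)
    # Threshold setting according to the classification result (steps S104-S106).
    threshold = TH1 if normal_score >= special_score else TH2
    # Region-of-interest detection: candidate frames with detection scores (step S103).
    candidate_frames = roi_detector(image)  # [(frame, detection_score), ...]
    # Output only the frames whose detection score exceeds the threshold (step S107).
    return [frame for frame, score in candidate_frames if score > threshold]
```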
  • As described above, the image processing system 200 includes the image acquisition unit 210 that acquires the image to be processed and the processing unit 220 that outputs a detection result, which is the result of detecting the region of interest in the image to be processed.
  • The processing unit 220 obtains a first classification score representing the certainty that the image to be processed was captured by the first observation method and a second classification score representing the certainty that the image to be processed was captured by the second observation method.
  • the processing unit 220 detects the region of interest in the image to be processed and obtains a detection score indicating the certainty of the detected region of interest.
  • the processing unit 220 sets the threshold value based on the first classification score and the second classification score. Then, as shown in step S107, the processing unit 220 compares the set threshold value with the detection score, and outputs the detection result of the region of interest when the detection score is larger than the threshold value.
  • the first classification score is a normal light score
  • the second classification score is a special light score
  • the second observation method may be dye spray observation
  • the second classification score in that case is information indicating the certainty that the image to be processed is a dye spray image.
  • an appropriate threshold value can be set based on the classification result of the observation method in which the image to be processed is captured. This makes it possible to suppress variations in sensitivity and over-detection according to the observation method of the image to be processed, and to perform consistent detection processing.
  • the process of obtaining the first classification score and the second classification score is performed based on the observation method classifier.
  • the process of obtaining the detection result and the detection score is performed based on the region of interest detector.
  • the processing based on each of the observation method classifier and the region of interest detector is realized by operating the processing unit 220 according to the instruction from the trained model.
  • the calculation in the processing unit 220 according to the trained model may be executed by software or hardware.
  • the multiply-accumulate operation executed at each node of FIG. 5A, the filter processing executed at the convolution layer of the CNN, and the like may be executed by software.
  • the above calculation may be executed by a circuit device such as FPGA.
  • the above calculation may be executed by a combination of software and hardware.
  • the operation of the processing unit 220 according to the command from the trained model can be realized by various aspects.
  • a trained model includes an inference algorithm and parameters used in the inference algorithm.
  • the inference algorithm is an algorithm that performs filter operations and the like based on input data.
  • the parameter is a parameter acquired by the learning process, and is, for example, a weighting coefficient.
  • both the inference algorithm and the parameters are stored in the storage unit 230, and the processing unit 220 may perform the inference processing by software by reading the inference algorithm and the parameters.
  • the inference algorithm may be realized by FPGA or the like, and the storage unit 230 may store the parameters.
  • an inference algorithm including parameters may be realized by FPGA or the like.
  • the storage unit 230 that stores the information of the trained model is, for example, the built-in memory of the FPGA.
  • the image to be processed in this embodiment is an in-vivo image captured by an endoscopic imaging device.
  • the endoscope image pickup device is an image pickup device provided in the endoscope system 300 and capable of outputting an imaging result of a subject image corresponding to a living body, and corresponds to an image pickup element 312 in a narrow sense.
  • the first observation method is an observation method in which normal light is used as illumination light
  • the second observation method is an observation method in which special light is used as illumination light.
  • the first observation method may be an observation method in which normal light is used as illumination light
  • the second observation method may be an observation method in which dye is sprayed on the subject. In this way, even if the observation method is changed by spraying the coloring material on the subject, it is possible to suppress variations in sensitivity and overdetection due to the change.
  • Special light observation and dye spray observation can improve the visibility of a specific subject as compared with normal light observation, so there is a great advantage in using them together with normal light observation.
  • The processing unit 220 obtains the first classification score representing the certainty that the processing target image is captured by the first observation method and the second classification score representing the certainty that the processing target image is captured by the second observation method based on a trained model.
  • The trained model is a model acquired by machine learning based on a learning image captured by the first observation method or the second observation method and observation method data indicating whether the learning image is an image captured by the first observation method or the second observation method.
  • the image processing system 200 of the present embodiment may further include a storage unit 230 that stores a first threshold value corresponding to the first observation method and a second threshold value corresponding to the second observation method.
  • the processing unit 220 sets the first threshold value as the threshold value when the first classification score is larger than the second classification score.
  • the processing unit 220 sets the second threshold value as the threshold value when the second classification score is larger than the first classification score.
  • When the first classification score and the second classification score are equal, the processing unit 220 may set either the first threshold value or the second threshold value as the threshold value.
  • the first threshold value here is a threshold value acquired by using the image captured in the first observation method as an evaluation image, and is, for example, Th1 described above.
  • the second threshold value is a threshold value obtained by using the image captured in the second observation method as an evaluation image, and is, for example, Th2. In this way, by selecting the threshold value based on the magnitude relationship of the classification score, it is possible to set the threshold value suitable for the observation method of the image to be processed.
  • the threshold setting in this embodiment is not limited to this.
  • the processing unit 220 may set the threshold value by weighting and adding the first threshold value and the second threshold value using the weights based on the first classification score and the second classification score.
  • For example, the threshold value setting unit 222 sets the threshold value Th based on the following equation (1), where SC1 denotes the first classification score and SC2 denotes the second classification score.
  • Th = SC1 × Th1 + SC2 × Th2 ... (1)
  • When both the first classification score and the second classification score are in the vicinity of 0.5, it is difficult to determine whether the image to be processed is a normal light image or a special light image.
  • If Th1 itself, which corresponds to a normal light image, is set as the threshold value in such a case, the fact that the image to be processed includes some image features similar to those of a special light image is not taken into consideration.
  • As a result, the sensitivity and over-detection rate may vary.
  • The same applies when Th2 itself, which corresponds to a special light image, is set as the threshold value.
  • the above equation (1) is an example of weighting addition, and the threshold value may be obtained by different operations.
  • In the above equation (1), the classification score itself, which is probability data, is used as the weight in the weighted addition, but the processing is not limited to this.
  • the weight may be determined by preparing table data in which the first classification score and the second classification score are associated with the weight in the weighting addition and referring to the table data.
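  • A minimal sketch of the weighted addition of equation (1) is shown below; the numeric values are illustrative, and the table-based weighting mentioned above is not covered here.
```python
def set_threshold(score1, score2, th1, th2):
    """Equation (1): Th = SC1 * Th1 + SC2 * Th2, using the classification scores
    as weights (score1 + score2 is assumed to be 1, as with a softmax output)."""
    return score1 * th1 + score2 * th2

# An ambiguous image (scores near 0.5) yields a threshold between Th1 and Th2.
print(set_threshold(0.55, 0.45, th1=0.62, th2=0.48))  # ≈ 0.557
```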
  • the image processing system 200 may be able to switch between a plurality of detection modes.
  • the storage unit 230 stores the threshold value according to the detection mode and the observation method.
  • For example, the storage unit 230 may store a threshold value Th11 suitable for the sensitivity priority mode and normal light observation, Th12 suitable for the overdetection suppression mode and normal light observation, Th21 suitable for the sensitivity priority mode and special light observation, and Th22 suitable for the overdetection suppression mode and special light observation.
  • the threshold value setting unit 222 sets the threshold value based on the current detection mode and the classification score output from the observation method classification unit 221. For example, when the detection mode is the sensitivity priority mode, the threshold value setting unit 222 sets the threshold value based on Th11, Th21, the first classification score, and the second classification score. Specifically, the threshold value setting unit 222 may select either Th11 or Th21 as described above, or may perform weighting addition. When the detection mode is the over-detection suppression mode, the threshold value setting unit 222 sets the threshold value based on Th12, Th22, the first classification score, and the second classification score.
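  • A minimal sketch of combining the detection mode with the per-observation-method thresholds Th11, Th12, Th21, and Th22; the dictionary layout, mode names, and numeric values are assumptions for illustration.

```python
# Assumed layout: THRESHOLDS[detection_mode][observation_method].
THRESHOLDS = {
    "sensitivity_priority":      {"normal": 0.30, "special": 0.25},  # Th11, Th21 (placeholders)
    "overdetection_suppression": {"normal": 0.60, "special": 0.55},  # Th12, Th22 (placeholders)
}

def mode_aware_threshold(mode: str, sc1: float, sc2: float) -> float:
    """Select the threshold pair of the current detection mode, then blend it
    with the classification scores as in equation (1)."""
    th_normal = THRESHOLDS[mode]["normal"]
    th_special = THRESHOLDS[mode]["special"]
    return sc1 * th_normal + sc2 * th_special
```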
  • the method of the present embodiment only needs to be able to execute a consistent detection process for each detection mode, and the number of detection modes may be one or two or more.
  • the detection mode may be determined by user input or may be automatically determined by the system.
  • the threshold value setting unit 222 can acquire information for specifying the detection mode.
  • The observation method classifier of the present embodiment may be a convolutional neural network (CNN). In this way, the observation method classification process for the image can be executed efficiently and with high accuracy.
  • Similarly, the region of interest detector of this embodiment may be a CNN. In this way, the detection process that takes the image as an input can be executed efficiently and with high accuracy.
  • the endoscope system 300 includes an imaging unit that captures an in-vivo image, an image acquisition unit that acquires an in-vivo image as a processing target image, and a processing unit that performs processing on the processing target image.
  • the image pickup unit in this case is, for example, an image pickup device 312.
  • the image acquisition unit is, for example, an A / D conversion unit 331.
  • the processing unit is, for example, a pre-processing unit 332, a detection processing unit 333, a post-processing unit 334, and the like. It is also possible to think that the image acquisition unit corresponds to the A / D conversion unit 331 and the preprocessing unit 332, and the specific configuration can be modified in various ways.
  • The processing unit of the endoscope system 300 obtains a first classification score indicating the certainty that the image to be processed is captured by the first observation method and a second classification score indicating the certainty that the image to be processed is captured by the second observation method.
  • The processing unit also detects a region of interest in the image to be processed and obtains a detection score indicating the certainty of the detected region of interest. Then, the processing unit sets a threshold value based on the first classification score and the second classification score, compares the set threshold value with the detection score, and outputs the detection result of the region of interest when the detection score is larger than the threshold value.
  • the processing performed by the image processing system 200 of the present embodiment may be realized as an image processing method.
  • In the image processing method, the image to be processed is acquired; a first classification score indicating the certainty that the image to be processed is captured by the first observation method and a second classification score indicating the certainty that the image to be processed is captured by the second observation method are obtained; a region of interest is detected in the image to be processed; a detection score representing the certainty of the detected region of interest is obtained; a threshold value is set based on the first classification score and the second classification score; the set threshold value is compared with the detection score; and, when the detection score is larger than the threshold value, the detection result of the region of interest is output.
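  • The image processing method could be sketched as the following pipeline; the classifier, detector, and threshold arguments are assumed stand-ins for the trained models and stored values, not an API defined by the disclosure.

```python
def process_image(image, classifier, detector, th1: float, th2: float):
    """One pass of the image processing method.

    classifier(image) -> (sc1, sc2): observation method classification scores.
    detector(image)   -> list of (region, detection_score) candidates.
    """
    sc1, sc2 = classifier(image)
    threshold = sc1 * th1 + sc2 * th2            # or select the max-score threshold
    detections = []
    for region, score in detector(image):
        if score > threshold:                    # output only sufficiently certain regions
            detections.append((region, score))
    return detections
```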
  • In the above description, the observation method classifier executes only the observation method classification process. However, the observation method classifier may execute the detection process of the region of interest in addition to the observation method classification process.
  • In the following, a case where the first observation method is normal light observation and the second observation method is special light observation will be described, but the second observation method may be dye spray observation.
  • the learning unit 120 of the present embodiment is not divided into the detection learning unit 121 and the observation method classification learning unit 122, and performs a process of generating an observation method classifier that performs both the detection process and the observation method classification process.
  • the observation method classifier of the second embodiment is also referred to as a detection integrated observation method classifier.
  • In the detection-integrated observation method classifier, for example, the CNN for detecting the region of interest and the CNN for classifying the observation method share a feature extraction layer that extracts features while repeating convolution, pooling, and nonlinear activation processing, and the output of the feature extraction layer branches into the output of the detection result and the output of the observation method classification result.
  • FIG. 10 is a diagram showing a configuration of a neural network of a detection integrated observation method classifier.
  • the CNN which is a detection-integrated observation method classifier, includes a feature amount extraction layer, a detection layer, and an observation method classification layer.
  • Each of the rectangular regions in FIG. 10 represents a layer that performs some calculation such as a convolution layer, a pooling layer, and a fully connected layer.
  • the configuration of the CNN is not limited to FIG. 10, and various modifications can be performed.
  • the feature amount extraction layer accepts the image to be processed as an input and outputs the feature amount by performing an operation including a convolution operation and the like.
  • the detection layer takes the feature amount output from the feature amount extraction layer as an input, and outputs information representing the detection result.
  • the output of the detection layer is, for example, a detection frame and a detection score associated with the detection frame.
  • the observation method classification layer receives the feature amount output from the feature amount extraction layer as an input, and outputs information representing the observation method classification result.
  • the output of the observation method classification layer is, for example, a first classification score and a second classification score.
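  • The layer structure of FIG. 10 could be sketched as below; the use of PyTorch, the layer sizes, and the single-box detection head are simplifying assumptions and not the configuration disclosed for the detection-integrated observation method classifier.

```python
import torch
import torch.nn as nn

class DetectionIntegratedClassifier(nn.Module):
    """Shared feature amount extraction layer feeding a detection layer and an
    observation method classification layer, as in FIG. 10 (simplified)."""

    def __init__(self, num_observation_methods: int = 2):
        super().__init__()
        # Feature amount extraction layer: convolution, pooling, nonlinear activation.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Detection layer: one box (x, y, w, h) plus a detection score.
        self.detection_head = nn.Linear(64, 5)
        # Observation method classification layer: one score per observation method.
        self.classification_head = nn.Linear(64, num_observation_methods)

    def forward(self, x):
        feat = self.features(x)
        box_and_score = self.detection_head(feat)
        detection_box = box_and_score[:, :4]
        detection_score = torch.sigmoid(box_and_score[:, 4])
        # Classification scores are probability data whose total is 1.
        classification_scores = torch.softmax(self.classification_head(feat), dim=1)
        return detection_box, detection_score, classification_scores
```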
  • the learning device 100 executes a learning process for determining weighting coefficients in each of the feature amount extraction layer, the detection layer, and the observation method classification layer.
  • Specifically, the learning unit 120 of the present embodiment generates the detection-integrated observation method classifier by performing learning processing based on an image group including learning images in which detection data and observation method data are added as correct answer data to normal light images and learning images in which detection data and observation method data are added to special light images.
  • the learning unit 120 takes a normal light image or a special light image included in the image group as an input and performs a forward calculation based on the current weighting coefficient.
  • The learning unit 120 then calculates the error between the result obtained by the forward calculation and the correct answer data as an error function, and updates the weighting coefficient so as to reduce the error function.
  • the learning unit 120 obtains the weighted sum of the error between the output of the detection layer and the detection data and the error between the output of the observation method classification layer and the observation method data as an error function.
  • all of the weighting coefficient in the feature amount extraction layer, the weighting coefficient in the detection layer, and the weighting coefficient in the observation method classification layer become learning targets.
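  • A sketch of the error function as a weighted sum of the detection error and the observation method error; the specific loss terms (smooth L1, binary cross entropy, negative log likelihood) and the weight lam are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def joint_loss(pred_box, pred_det_score, pred_obs_scores,
               gt_box, gt_objectness, gt_obs_label, lam: float = 1.0):
    """Weighted sum of the detection error and the observation method error."""
    detection_error = (F.smooth_l1_loss(pred_box, gt_box)
                       + F.binary_cross_entropy(pred_det_score, gt_objectness))
    # pred_obs_scores are softmax probabilities, so take their log for NLL.
    observation_error = F.nll_loss(torch.log(pred_obs_scores + 1e-8), gt_obs_label)
    return detection_error + lam * observation_error
```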
  • FIG. 11 is a configuration example of the image processing system 200 according to the second embodiment.
  • the processing unit 220 of the image processing system 200 includes a detection classification unit 225, a threshold value setting unit 222, and an output processing unit 224.
  • the detection classification unit 225 outputs the detection result and the observation method classification result based on the detection integrated observation method classifier generated by the learning device 100.
  • the threshold value setting unit 222 and the output processing unit 224 are the same as those in the first embodiment.
  • FIG. 12 is a flowchart illustrating the processing of the image processing system 200 in the second embodiment.
  • the image acquisition unit 210 acquires an in-vivo image captured by the endoscope imaging device as a processing target image.
  • In step S202, the detection classification unit 225 performs a forward calculation using the processing target image acquired by the image acquisition unit 210 as an input of the detection-integrated observation method classifier.
  • the detection classification unit 225 acquires the information representing the detection result from the detection layer and the information representing the observation method classification result from the observation method classification layer. Specifically, the detection classification unit 225 acquires the detection frame, the detection score, the first classification score, and the second classification score.
  • Steps S203 to S206 are the same as steps S104 to S107 of FIG. That is, in steps S203 to S205, the threshold value setting unit 222 sets the threshold value based on the first classification score and the second classification score. In step S206, the output processing unit 224 outputs the detection result based on the detection score and the set threshold value. However, this embodiment differs from the first embodiment in that the detection frame and the detection score are information output by the detection-integrated observation method classifier.
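  • A sketch of this flow, in which a single forward calculation of the detection-integrated model yields the detection frame, the detection score, and both classification scores; the model interface follows the assumed sketch given after the description of FIG. 10.

```python
def detect_with_integrated_model(image_tensor, model, th1: float, th2: float):
    """Single forward calculation, then threshold setting and output (S202 to S206)."""
    box, det_score, obs_scores = model(image_tensor)          # forward calculation
    sc1, sc2 = obs_scores[0, 0].item(), obs_scores[0, 1].item()
    threshold = sc1 * th1 + sc2 * th2                         # threshold setting
    if det_score.item() > threshold:                          # output processing
        return box[0], det_score.item()
    return None                                               # nothing to display
```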
  • the processing unit 220 in the present embodiment obtains the first classification score, the second classification score, and the detection score by operating according to the trained model.
  • The trained model is a model acquired by machine learning based on a learning image captured by the first observation method or the second observation method and correct answer data, where the correct answer data includes detection data indicating the region of interest in the learning image and the observation method data described above.
  • the observation method classifier can also serve as a detector for the region of interest.
  • With the configuration shown in FIG. 10, the feature amount extraction in the detection process and the feature amount extraction in the observation method classification process can be shared. Therefore, the size of the trained model can be reduced as compared with the case where a separate feature amount extraction layer is provided for each process.
  • Since the storage unit 230 of the image processing system 200 stores the weighting coefficients of the trained model, the capacity required of the storage unit 230 can be reduced.
  • Similarly, when an inference processing algorithm according to the trained model is implemented using an FPGA or the like, the size of the FPGA can be reduced.
  • In the above, the case where the first observation method is normal light observation and the second observation method is special light observation or dye spray observation has been described.
  • However, the number of observation methods is not limited to two.
  • three observation methods may be used: normal light observation, special light observation, and dye spray observation.
  • the observation method is not limited to normal light observation, special light observation, and dye spray observation.
  • The observation method of the present embodiment may include water supply observation, which is an observation method in which an image is captured while a water supply operation for discharging water from the insertion portion is performed; air supply observation, which is an observation method in which an image is captured while an air supply operation for discharging gas from the insertion portion is performed; bubble observation, which is an observation method in which a subject with bubbles attached is imaged; and residue observation, which is an observation method in which a subject with residues is imaged.
  • the combination of observation methods can be flexibly changed, and two or more of normal light observation, special light observation, dye spray observation, water supply observation, air supply observation, bubble observation, and residue observation can be arbitrarily combined. Further, an observation method other than the above may be used.
  • When N observation methods (N is an integer of 3 or more) are assumed, the observation method classifier outputs first to Nth classification scores.
  • the i-th classification score is data representing the certainty that the image input to the observation method classifier was captured in the i-th observation method.
  • i is an integer of 1 or more and N or less.
  • The first to Nth classification scores are probability data whose total is 1.
  • the storage unit 230 stores threshold values Th1 to ThN suitable for each of the first to Nth observation methods.
  • the threshold value setting unit 222 sets the threshold value based on the first to Nth classification scores, which are the outputs of the observation method classifier, and the threshold values Th1 to ThN.
  • The threshold value setting unit 222 may select any one of Th1 to ThN as the threshold value based on the classification score having the maximum value among the first to Nth classification scores, or may calculate the threshold value by weighting and adding the first to Nth classification scores and Th1 to ThN.
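  • A sketch of the N-observation-method case; the score and threshold lists are assumed inputs, and both the maximum-score selection and the weighting addition described above are shown.

```python
def threshold_for_n_methods(scores, thresholds, weighted: bool = True) -> float:
    """scores: [SC1, ..., SCN] summing to 1; thresholds: [Th1, ..., ThN]."""
    if weighted:
        # Weighting addition: Th = sum_i SCi * Thi
        return sum(sc * th for sc, th in zip(scores, thresholds))
    # Otherwise, pick the threshold of the observation method with the largest score.
    best = max(range(len(scores)), key=lambda i: scores[i])
    return thresholds[best]
```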
  • FIG. 13 is an example of the threshold value corresponding to each observation method stored in the storage unit 230.
  • For example, the storage unit 230 stores, as threshold values for realizing the sensitivity priority mode, seven threshold values Th11 to Th71 suitable for normal light observation, special light observation, dye spray observation, water supply observation, air supply observation, bubble observation, and residue observation, respectively.
  • Similarly, the storage unit 230 stores, as threshold values for realizing the overdetection suppression mode, seven threshold values Th12 to Th72 suitable for these observation methods. That is, when the number of detection modes is M (M is an integer of 1 or more) and the number of observation methods is N, the storage unit 230 stores N × M threshold values Th11 to ThNM.
  • the threshold setting unit 222 selects N thresholds out of N ⁇ M thresholds by specifying the detection mode. For example, when the j-th detection mode (j is an integer of 1 or more and M or less) is realized, the threshold value setting unit 222 selects Th1j to ThNj. Then, the threshold value is set based on the first to Nth classification scores and the threshold values Th1j to ThNj.
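  • A sketch of narrowing the N × M stored threshold values down to the N values Th1j to ThNj of the current detection mode before applying the computation above; the array layout is an assumption.

```python
def threshold_for_mode(scores, threshold_table, mode_index: int) -> float:
    """threshold_table[i][j] corresponds to Th(i+1)(j+1), i.e. observation
    method i+1 combined with detection mode j+1; scores are SC1..SCN."""
    column = [row[mode_index] for row in threshold_table]   # Th1j .. ThNj
    return sum(sc * th for sc, th in zip(scores, column))   # weighting addition
```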
  • As described above, the number of observation methods can be expanded to three or more.
  • Similarly, the number of detection modes is not limited to one or two and may be expanded to three or more. In this way, even when a variety of observation methods are targeted, a consistent detection process can be realized regardless of the observation method.
  • Diagnosis by a doctor can be considered as a step of searching for a lesion using normal light observation and a step of distinguishing the malignancy of the found lesion using special light observation. Since a special light image has higher lesion visibility than a normal light image, the malignancy can be distinguished accurately. However, the number of special light images that can be acquired is smaller than that of normal light images. Therefore, in machine learning using special light images, there is a risk that the detection accuracy will decrease due to the lack of training data.
  • A method of pre-training and fine tuning is known as a countermeasure against a lack of training data. In the conventional method, however, the difference in observation method between the special light image and the normal light image is not taken into consideration.
  • The test image here represents an image that is the target of inference processing using the learning result. That is, the conventional method does not disclose a method for improving the accuracy of the detection process for a special light image.
  • In the present embodiment, therefore, pre-training is performed using an image group including normal light images, and fine tuning is performed using an image group including special light images.
  • the second observation method may be dye spray observation.
  • the second observation method can be extended to other observation methods in which the detection accuracy may decrease due to the lack of training data.
  • the second observation method may be the above-mentioned air supply observation, water supply observation, bubble observation, residue observation, or the like.
  • FIG. 14 is a configuration example of the learning device 100 of the present embodiment.
  • the learning unit 120 includes a pre-training unit 123 and a fine tuning unit 124.
  • the pre-training unit 123 acquires the image group B1 from the image acquisition unit 110 and performs machine learning based on the image group B1 to perform pre-training of the detection integrated observation method classifier.
  • the image group B1 includes a learning image in which detection data is added to a normal optical image. As described above, ordinary light observation is widely used in the process of searching for a region of interest. Therefore, abundant normal optical images to which the detection data are added can be acquired.
  • the process performed by the pre-training unit 123 using the image group B1 is pre-training for the detection task.
  • the pre-training for the detection task is a learning process for updating the weighting coefficients of the feature amount extraction layer and the detection layer in FIG. 10 by using the detection data as correct answer data. That is, in the pre-training of the detection-integrated observation method classifier, the weighting coefficient of the observation method classification layer is not a learning target.
  • the fine tuning unit 124 performs learning processing using a special light image that is difficult to acquire abundantly.
  • The image group B2 is an image group including learning images in which detection data and observation method data are added to normal light images and learning images in which detection data and observation method data are added to special light images.
  • the fine-tuning unit 124 generates a detection-integrated observation method classifier by executing a learning process using the image group B2 with the weighting coefficient acquired by pre-training as an initial value. In fine tuning, learning is performed for both the detection task and the observation method classification task, so that all the weighting coefficients of the feature extraction layer, the detection layer, and the observation method classification layer are the learning targets.
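  • A sketch of the two-stage training, reusing the assumed model class sketched earlier; freezing the observation method classification head during pre-training, the optimizer, and the loss terms are illustrative choices rather than the disclosed procedure.

```python
import torch
import torch.nn.functional as F

def pretrain_then_finetune(model, loader_b1, loader_b2, lam: float = 1.0):
    """Stage 1 (image group B1): detection task only; the observation method
    classification layer is not a learning target.
    Stage 2 (image group B2): detection and observation method classification,
    starting from the pre-trained weighting coefficients."""
    # --- Pre-training ---
    for p in model.classification_head.parameters():
        p.requires_grad = False
    opt = torch.optim.Adam([p for p in model.parameters() if p.requires_grad])
    for image, gt_box, gt_obj, _ in loader_b1:
        box, det_score, _scores = model(image)
        loss = F.smooth_l1_loss(box, gt_box) + F.binary_cross_entropy(det_score, gt_obj)
        opt.zero_grad(); loss.backward(); opt.step()

    # --- Fine tuning ---
    for p in model.parameters():
        p.requires_grad = True
    opt = torch.optim.Adam(model.parameters())
    for image, gt_box, gt_obj, gt_obs in loader_b2:
        box, det_score, obs_scores = model(image)
        loss = (F.smooth_l1_loss(box, gt_box)
                + F.binary_cross_entropy(det_score, gt_obj)
                + lam * F.nll_loss(torch.log(obs_scores + 1e-8), gt_obs))
        opt.zero_grad(); loss.backward(); opt.step()
    return model
```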
  • The processing after the generation of the detection-integrated observation method classifier is the same as that of the second embodiment. Further, the method of the fourth embodiment and the method of the third embodiment may be combined. That is, when three or more observation methods including normal light observation are used, it is possible to combine pre-training using normal light images with fine tuning using images captured by an observation method for which the number of captured images is insufficient.
  • The observation method classifier and the region of interest detector may be provided separately.
  • In that case, the region of interest detector is generated by performing pre-training using normal light images and fine tuning using normal light images and special light images.
  • The observation method classifier may be generated by performing pre-training for the detection task using normal light images and then performing fine tuning for the observation method classification task while reusing the feature amount extraction layer obtained by the pre-training.
  • The trained model may be a model that is pre-trained using a first image group including images captured by the first observation method and, after the pre-training, learned by fine tuning using a second image group including images captured by the first observation method and images captured by the second observation method.
  • the trained model here is specifically a detection-integrated observation method classifier.
  • the first image group corresponds to the image group B1 and is an image group including a plurality of learning images to which detection data is added to a normal optical image.
  • The second image group corresponds to the image group B2 and is an image group including learning images in which detection data and observation method data are added to normal light images and learning images in which detection data and observation method data are added to special light images. When there are three or more observation methods, the second image group includes learning images captured by each of the plurality of observation methods.
  • The pre-training is performed in order to compensate for the shortage of learning images; it is a process of setting the initial values of the weighting coefficients used in fine tuning. As a result, the accuracy of the detection process can be improved as compared with the case where pre-training is not performed.
  • Illumination lens 315 ... Light guide, 316 ... AF start / end button, 320 ... External I / F unit, 330 ... System control device, 331 ... A / D conversion unit, 332 ... pre-processing unit, 333 ... detection processing unit, 334 ... post-processing unit, 335 ... system control unit, 336 ... control unit, 337 ... storage unit, 340 ... display unit , 350 ... Light source device, 352 ... Light source

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Surgery (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Biomedical Technology (AREA)
  • Optics & Photonics (AREA)
  • Pathology (AREA)
  • Radiology & Medical Imaging (AREA)
  • Biophysics (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to an image processing system (200) that comprises: an image acquisition unit (210) that acquires an image to be processed; and a processing unit (220) that performs processing on the image to be processed. The processing unit (220) determines a first classification score representing the probability that the image to be processed was captured using a first observation method and a second classification score representing the probability that the image to be processed was captured using a second observation method; detects a region of interest in the image to be processed; determines a detection score representing the probability of the detected region of interest; sets a threshold value on the basis of the first classification score and the second classification score; compares the set threshold value with the detection score; and, if the detection score is greater than the threshold value, outputs a detection result for the region of interest.
PCT/JP2020/000376 2020-01-09 2020-01-09 Dispositif de traitement d'image, système d'endoscope et procédé de traitement d'image WO2021140601A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/000376 WO2021140601A1 (fr) 2020-01-09 2020-01-09 Dispositif de traitement d'image, système d'endoscope et procédé de traitement d'image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/000376 WO2021140601A1 (fr) 2020-01-09 2020-01-09 Dispositif de traitement d'image, système d'endoscope et procédé de traitement d'image

Publications (1)

Publication Number Publication Date
WO2021140601A1 true WO2021140601A1 (fr) 2021-07-15

Family

ID=76788169

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/000376 WO2021140601A1 (fr) 2020-01-09 2020-01-09 Dispositif de traitement d'image, système d'endoscope et procédé de traitement d'image

Country Status (1)

Country Link
WO (1) WO2021140601A1 (fr)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002083301A (ja) * 2000-09-06 2002-03-22 Mitsubishi Electric Corp 交通監視装置
JP2007140823A (ja) * 2005-11-17 2007-06-07 Omron Corp 顔照合装置、顔照合方法及びプログラム
JP2009518982A (ja) * 2005-12-08 2009-05-07 クゥアルコム・インコーポレイテッド 適応性自動ホワイト・バランス
JP2011161046A (ja) * 2010-02-10 2011-08-25 Olympus Corp 蛍光内視鏡装置
WO2012147820A1 (fr) * 2011-04-28 2012-11-01 オリンパス株式会社 Dispositif d'observation fluorescent et son procédé d'affichage d'images
JP2013056001A (ja) * 2011-09-07 2013-03-28 Olympus Corp 蛍光観察装置
JP2016015116A (ja) * 2014-06-12 2016-01-28 パナソニックIpマネジメント株式会社 画像認識方法、カメラシステム
WO2016110984A1 (fr) * 2015-01-08 2016-07-14 オリンパス株式会社 Dispositif de traitement d'image, procédé de fonctionnement de dispositif de traitement d'image, programme de fonctionnement de dispositif de traitement d'image, et dispositif d'endoscope
WO2020003991A1 (fr) * 2018-06-28 2020-01-02 富士フイルム株式会社 Dispositif, procédé et programme d'apprentissage d'image médicale


Similar Documents

Publication Publication Date Title
JP7104810B2 (ja) 画像処理システム、学習済みモデル及び画像処理方法
WO2021140602A1 (fr) Système de traitement d'image, dispositif d'apprentissage et procédé d'apprentissage
WO2021140600A1 (fr) Système de traitement d'image, système d'endoscope et procédé de traitement d'image
Iqbal et al. Recent trends and advances in fundus image analysis: A review
US12026935B2 (en) Image processing method, training device, and image processing device
JP2021532881A (ja) マルチスペクトル情報を用いた拡張画像化のための方法およびシステム
JP2021532891A (ja) マルチスペクトル情報を用いた観血的治療における拡張画像化のための方法およびシステム
JP7278202B2 (ja) 画像学習装置、画像学習方法、ニューラルネットワーク、及び画像分類装置
JP7005767B2 (ja) 内視鏡画像認識装置、内視鏡画像学習装置、内視鏡画像学習方法及びプログラム
JP7304951B2 (ja) コンピュータプログラム、内視鏡用プロセッサの作動方法及び内視鏡用プロセッサ
WO2020008834A1 (fr) Dispositif de traitement d'image, procédé et système endoscopique
JP6952214B2 (ja) 内視鏡用プロセッサ、情報処理装置、内視鏡システム、プログラム及び情報処理方法
WO2021181520A1 (fr) Système de traitement d'images, dispositif de traitement d'images, système d'endoscope, interface et procédé de traitement d'images
US20230005247A1 (en) Processing system, image processing method, learning method, and processing device
WO2021229684A1 (fr) Système de traitement d'image, système d'endoscope, procédé de traitement d'image et procédé d'apprentissage
JP2022055953A (ja) 欠陥分類装置、欠陥分類方法及びプログラム
Zhang et al. Detection and segmentation of multi-class artifacts in endoscopy
WO2021140601A1 (fr) Dispositif de traitement d'image, système d'endoscope et procédé de traitement d'image
JP2016122905A (ja) 画像処理装置、画像処理方法及びプログラム
Dayana et al. A comprehensive review of diabetic retinopathy detection and grading based on deep learning and metaheuristic optimization techniques
US20230100147A1 (en) Diagnosis support system, diagnosis support method, and storage medium
JP7162744B2 (ja) 内視鏡用プロセッサ、内視鏡システム、情報処理装置、プログラム及び情報処理方法
WO2022097294A1 (fr) Système de traitement d'informations, système d'endoscope et procédé de traitement d'informations
KR20230059244A (ko) 인공지능 기반의 내시경 진단 보조 시스템 및 이의 제어방법
WO2021044590A1 (fr) Système d'endoscope, système de traitement, procédé de fonctionnement de système d'endoscope et programme de traitement d'image

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20912680

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20912680

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP