WO2020008834A1 - Image processing device, method, and endoscopic system - Google Patents

Image processing device, method, and endoscopic system

Info

Publication number: WO2020008834A1
Authority: WIPO (PCT)
Prior art keywords: image, images, light, endoscope, observation
Application number: PCT/JP2019/023492
Other languages: French (fr), Japanese (ja)
Inventors: 慧 内藤, 駿平 加門
Original Assignee: FUJIFILM Corporation (富士フイルム株式会社)
Application filed by FUJIFILM Corporation
Priority to JP2020528760A (granted as patent JP7289296B2)
Publication of WO2020008834A1
Priority to JP2022168121A (published as JP2022189900A)

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 1/00: Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; Illuminating arrangements therefor
    • A61B 1/04: Instruments for performing medical examinations of the interior of cavities or tubes of the body combined with photographic or television appliances
    • A61B 1/045: Control thereof
    • G: PHYSICS
    • G02: OPTICS
    • G02B: OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B 23/00: Telescopes, e.g. binoculars; Periscopes; Instruments for viewing the inside of hollow bodies; Viewfinders; Optical aiming or sighting devices
    • G02B 23/24: Instruments or systems for viewing the inside of hollow bodies, e.g. fibrescopes
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis

Definitions

  • the present invention relates to an image processing apparatus, an image processing method, and an endoscope system, and more particularly, to a technology that can be used for assisting a doctor in endoscopy.
  • Patent Literature 1 proposes an information processing apparatus including an acquisition unit that acquires a plurality of images of cells photographed in time series, an assigning unit that assigns time-series evaluation values to the acquired images for each of one or more predetermined evaluation items, and an evaluation unit that evaluates the cells based on the temporal change of the assigned evaluation values.
  • The evaluation unit assigns the time-series evaluation values to the plurality of images according to a machine learning algorithm and evaluates the observed cells based on the temporal change of those values. This enables an evaluation that comprehensively considers the time-series behavior of the cells.
  • Patent Literature 2 proposes a processing apparatus that acquires time-series input data, which is a data sequence in a moving image, supplies a plurality of input values corresponding to the input data at one time point to a plurality of nodes of a trained model (a model constituting a Boltzmann machine) corresponding to the time-series input data, calculates, based on the input data series preceding the prediction target time point and the weight parameters between each of the input values and each of the nodes, the conditional probability that each input value corresponding to the prediction target time point occurs under the condition that the input data series has occurred, and, based on these conditional probabilities, calculates the conditional probability that the next input data takes a predetermined value under the condition that the time-series input data has occurred.
  • As an example, this processing apparatus can predict the one image arrayed at the next time from T-1 images arranged in time series, generating a moving image including T images in total.
  • The information processing apparatus of Patent Literature 1 evaluates various changes and the like in the culture process of the imaged cells (fertilized eggs) based on a plurality of images captured in time series, and these images are captured under the same imaging conditions. This is because, unless the images are captured under the same imaging conditions, the change in the fertilized egg cannot be evaluated from the acquired images. That is, the plurality of images are not images sequentially acquired using different observation lights.
  • The processing apparatus described in Patent Literature 2 enables prediction of the image data at the next time using a trained model that receives time-series input data, and the time-series input data is captured under the same imaging conditions. This is because, unless the data is captured under the same imaging conditions, the image data at the next time cannot be predicted from the input time-series data. That is, the time-series input data is not input data sequentially acquired using different observation lights.
  • Moreover, the inventions described in Patent Documents 1 and 2 both input a plurality of time-series images in order to predict objects that change over time (cells, a future moving image); they do not input multiple images for the purpose of improving recognition accuracy in a recognizer.
  • The present invention has been made in view of such circumstances, and an object thereof is to provide an image processing apparatus, an image processing method, and an endoscope system that can improve recognition accuracy based on a plurality of images and can present a good observation image together with the recognition result.
  • In order to achieve the above object, an image processing apparatus according to one aspect of the present invention includes a recognizer that receives an image set including a plurality of images sequentially acquired using a plurality of different observation lights and outputs a recognition result for the image set, and a display control unit that causes a display unit to display the recognition result together with an observation image calculated using a part or all of the plurality of images.
  • Since an image set sequentially acquired using a plurality of different observation lights is input and a recognition result for the image set is acquired, the recognition accuracy can be improved compared with the case where recognition is performed on a single image acquired with a single observation light, and the recognition result can be appropriately presented.
  • Preferably, the recognizer has a trained model learned from a plurality of learning images and correct data, and outputs a recognition result based on the trained model each time it receives a plurality of images for recognition.
  • Preferably, the trained model is configured by a convolutional neural network; the convolutional neural network is excellent at recognizing images.
  • Preferably, the plurality of images include a first endoscope image and a second endoscope image acquired using observation light different from that of the first endoscope image. In endoscopy, a plurality of images may be acquired using a plurality of different observation lights, and the present invention can be applied to such an examination.
  • Preferably, the first endoscope image is a normal light image captured with normal light, and the second endoscope image is a special light image captured with special light.
  • In general, a normal light image is used as the image for observation, and a special light image is used when it is desired to observe, for example, a surface structure.
  • Preferably, the special light image includes two or more special light images captured with two or more different special lights. Two or more special light images can be captured according to the observation purpose, for example when the depths of the surface structures to be observed differ.
  • The first endoscope image may be a first special light image captured with first special light, and the second endoscope image may be a second special light image captured with second special light different from the first special light. That is, the plurality of endoscope images need not include a normal light image.
  • Preferably, the display control unit causes the display unit to display, as a moving image, an observation image calculated using a part or all of the plurality of images.
  • Preferably, the recognizer recognizes a region of interest included in the plurality of images, and the display control unit superimposes an index indicating the recognized region of interest on the image displayed on the display unit. This supports the inspection so that the region of interest in the observation image is not overlooked.
  • Preferably, the recognizer recognizes a region of interest included in the plurality of images, and the display control unit displays information indicating the presence or absence of the region of interest so as not to overlap the image on the display unit. This makes it possible to notify the operator that a region of interest exists in the observation image without the displayed information disturbing observation of the image.
  • Preferably, the recognizer executes discrimination regarding a lesion based on the plurality of images and outputs the discrimination result, and the display control unit causes the display unit to display the discrimination result. This makes it possible to visually inspect the observation image while referring to the discrimination result obtained by the recognizer.
  • An endoscope system according to another aspect of the present invention includes a light source device that sequentially generates first observation light and second observation light different from the first observation light, an endoscope scope that captures a plurality of images by sequentially imaging the observation target illuminated by the first observation light and the second observation light, a display unit, and the image processing apparatus described above; the recognizer receives an image set including the plurality of images captured by the endoscope scope.
  • Preferably, the endoscope system includes an endoscope processor that receives the plurality of images captured by the endoscope scope and performs image processing on them, and the recognizer receives the plurality of images after the image processing by the endoscope processor.
  • Since the endoscope processor has a function of performing image processing on the images captured by the endoscope scope, the recognizer can detect and discriminate a lesion area using the plurality of images after the image processing.
  • The recognizer may be separate from the endoscope processor, or may be built into the endoscope processor.
  • An image processing method according to still another aspect of the present invention includes a first step of receiving an image set including a plurality of images sequentially acquired using a plurality of different observation lights, a second step in which a recognizer outputs a recognition result for the image set, and a third step in which a display control unit causes a display unit to display the recognition result together with an observation image calculated using a part or all of the plurality of images; the first to third steps are repeatedly executed.
  • Preferably, in the second step, the recognizer, which has a trained model learned from learning image sets and correct data, outputs a recognition result based on the trained model each time it receives an image set for recognition.
  • the learned model is configured by a convolutional neural network.
  • Preferably, the plurality of images include a first endoscope image and a second endoscope image acquired using observation light different from that of the first endoscope image.
  • Preferably, the first endoscope image is a normal light image captured with normal light, and the second endoscope image is a special light image captured with special light.
  • According to the present invention, recognition is performed based on an image set including a plurality of images sequentially acquired using a plurality of different observation lights, so that the recognition accuracy can be improved and the recognition result can be appropriately presented.
  • FIG. 1 is a perspective view showing an appearance of an endoscope system 10 according to the present invention.
  • FIG. 2 is a block diagram illustrating an electrical configuration of the endoscope system 10.
  • FIG. 3 is a diagram illustrating an example of a multi-frame image captured in the multi-frame shooting mode and of an image set.
  • FIG. 4 is a schematic diagram showing a typical configuration example of a convolutional neural network which is one of the learning models constituting the recognizer 15.
  • FIG. 5 is a schematic diagram showing a configuration example of the intermediate layer 15B of the CNN 15 shown in FIG. 4.
  • FIG. 6 is a block diagram showing a main configuration used for explaining the operation of the endoscope system 10 according to the present invention.
  • FIG. 7 is a diagram illustrating an example of an image set including an R image, a G image, a B image, and a V image captured in a frame sequential manner.
  • FIG. 8 is a flowchart showing an embodiment of the image processing method according to the present invention.
  • FIG. 1 is a perspective view showing an appearance of an endoscope system 10 according to the present invention.
  • As shown in FIG. 1, the endoscope system 10 mainly comprises an endoscope scope (here, a flexible endoscope) 11 for imaging an observation target in a subject, a light source device 12, an endoscope processor 13, a display unit (display) 14 such as a liquid crystal monitor, and a recognizer 15.
  • the light source device 12 supplies the endoscope 11 with various kinds of observation light such as white light for capturing a normal light image and light in a specific wavelength band for capturing a special light image.
  • The endoscope processor 13 has an image processing function of generating image data of a normal light image, a special light image, or an observation image for display/recording based on the image signal obtained by the endoscope scope 11, a function of controlling the light source device 12, and a function of displaying the normal image or observation image and the recognition result of the recognizer 15 on the display 14. Although the details of the recognizer 15 will be described later, it is a part that accepts an endoscope image and performs recognition such as detecting the position of a region of interest (lesion, surgical scar, treatment scar, treatment tool, etc.) in the endoscope image and discriminating the type of lesion.
  • the display 14 displays a normal image, a special light image or an image for observation, and a recognition result by the recognizer 15 based on display image data input from the endoscope processor 13.
  • The endoscope scope 11 includes a flexible insertion portion 16 to be inserted into the subject, a hand operation unit 17 connected to the base end of the insertion portion 16 and used for gripping the endoscope scope 11 and operating the insertion portion 16, and a universal cord 18 that connects the hand operation unit 17 to the light source device 12 and the endoscope processor 13.
  • The illumination lens 42, the objective lens 44, the imaging element 45, and the like are built into the insertion portion distal end portion 16a, which is the tip of the insertion portion 16 (see FIG. 2).
  • A freely bendable bending portion 16b is connected to the rear end of the insertion portion distal end portion 16a.
  • A flexible tube portion 16c having flexibility is connected to the rear end of the bending portion 16b.
  • the hand operation unit 17 is provided with an angle knob 21, an operation button 22, a forceps inlet 23, and the like.
  • the angle knob 21 is rotated when adjusting the bending direction and the bending amount of the bending portion 16b.
  • the operation button 22 is used for various operations such as air supply / water supply and suction.
  • the forceps inlet 23 communicates with a forceps channel in the insertion section 16.
  • the hand operation unit 17 is provided with an endoscope operation unit 46 (see FIG. 2) for performing various settings.
  • the universal cord 18 incorporates an air / water channel, a signal cable, a light guide, and the like.
  • The distal end of the universal cord 18 is provided with a connector portion 25a connected to the light source device 12 and a connector portion 25b connected to the endoscope processor 13. Thereby, observation light is supplied from the light source device 12 to the endoscope scope 11 via the connector portion 25a, and the image signal obtained by the endoscope scope 11 is input to the endoscope processor 13 via the connector portion 25b.
  • The light source device 12 is provided with a light source operation unit 12a including a power button, a lighting button for turning on the light source, a brightness adjustment button, and the like.
  • The endoscope processor 13 is provided with a processor operation unit 13a including a power button and an input unit that receives input from a pointing device such as a mouse (not shown).
  • Although the endoscope processor 13 and the light source device 12 of this example are of a separate type, the endoscope processor may be of a type with a built-in light source device.
  • FIG. 2 is a block diagram illustrating an electrical configuration of the endoscope system 10.
  • The endoscope scope 11 roughly comprises a light guide 40, an illumination lens 42, an objective lens 44, an imaging element 45, an endoscope operation unit 46, an endoscope control unit 47, and a ROM (Read Only Memory) 48.
  • the light guide 40 uses a large-diameter optical fiber, a bundle fiber, or the like.
  • The light guide 40 has an incident end inserted into the light source device 12 via the connector portion 25a, and an emission end that passes through the insertion portion 16 and faces the illumination lens 42 provided in the insertion portion distal end portion 16a.
  • the illumination light supplied from the light source device 12 to the light guide 40 is applied to the observation target through the illumination lens 42. Then, the illumination light reflected and / or scattered by the observation target enters the objective lens 44.
  • the objective lens 44 forms reflected light or scattered light (that is, an optical image of an observation target) of the incident illumination light on the imaging surface of the imaging element 45.
  • The imaging element 45 is a complementary metal oxide semiconductor (CMOS) type or charge coupled device (CCD) type image sensor, and is positioned and fixed relative to the objective lens 44 on the far side of the objective lens 44.
  • a plurality of pixels constituted by a plurality of photoelectric conversion elements (photodiodes) for photoelectrically converting an optical image are two-dimensionally arranged on an imaging surface of the imaging element 45.
  • Red (R), green (G), and blue (B) color filters are arranged for the respective pixels on the incident surface side of the plurality of pixels of the imaging element 45 of this example, thereby forming R pixels, G pixels, and B pixels. The filter array of the RGB color filters is generally a Bayer array, but is not limited to this.
  • the imaging element 45 converts the optical image formed by the objective lens 44 into an electric image signal and outputs it to the endoscope processor 13.
  • When the imaging element 45 is a CMOS type, an A/D (Analog/Digital) converter is typically built in, and a digital image signal is output directly from the imaging element 45 to the endoscope processor 13.
  • When the imaging element 45 is a CCD type, the image signal output from the imaging element 45 is converted into a digital image signal by an A/D converter or the like (not shown) and then output to the endoscope processor 13.
  • The endoscope operation unit 46 has a still image shooting button (not shown) and a shooting mode setting unit that sets any one of a normal light image shooting mode, a special light image shooting mode, and a multi-frame shooting mode.
  • the photographing mode setting unit may be provided in the processor operation unit 13a of the endoscope processor 13.
  • the endoscope control unit 47 sequentially executes various programs and data read from the ROM 48 or the like in accordance with an operation on the endoscope operation unit 46, and mainly controls the driving of the imaging element 45.
  • In the normal light image capturing mode, the endoscope control unit 47 controls the imaging element 45 so as to read out the signals of the R pixels, G pixels, and B pixels of the imaging element 45. In the special light image capturing mode or the multi-frame shooting mode, when the V-LED 32a emits violet light or the B-LED 32b emits blue light as the observation light in order to acquire a specific special light image, the endoscope control unit 47 controls the imaging element 45 so as to read out only the signals of the B pixels, which have spectral sensitivity in the wavelength bands of violet light and blue light, or to read out only one or two of the three color pixels (R, G, and B pixels).
  • The endoscope control unit 47 communicates with the processor control unit 61 of the endoscope processor 13, and transmits to the endoscope processor 13 the operation information of the endoscope operation unit 46 and identification information, stored in the ROM 48, for identifying the type of the endoscope scope 11.
  • the light source device 12 has a light source control unit 31 and a light source unit 32.
  • The light source control unit 31 controls the light source unit 32 and communicates with the processor control unit 61 of the endoscope processor 13 to exchange various information.
  • the light source unit 32 has, for example, a plurality of semiconductor light sources.
  • The light source unit 32 has LEDs of four colors: a V-LED (Violet Light Emitting Diode) 32a, a B-LED (Blue Light Emitting Diode) 32b, a G-LED (Green Light Emitting Diode) 32c, and an R-LED (Red Light Emitting Diode) 32d.
  • The V-LED 32a, B-LED 32b, G-LED 32c, and R-LED 32d are semiconductor light sources that emit violet (V) light, blue (B) light, green (G) light, and red (R) light as observation light, having peak wavelengths at, for example, 410 nm, 450 nm, 530 nm, and 615 nm, respectively.
  • the light source control unit 31 individually controls the on / off of the four LEDs of the light source unit 32, the light emission amount at the time of lighting, and the like for each LED according to the shooting mode set by the shooting mode setting unit.
  • In the normal light image capturing mode, the light source control unit 31 turns on all of the V-LED 32a, B-LED 32b, G-LED 32c, and R-LED 32d, so that white light including V light, B light, G light, and R light is used as the observation light.
  • In the special light image capturing mode, the light source control unit 31 turns on any one of the V-LED 32a, B-LED 32b, G-LED 32c, and R-LED 32d, or an appropriate combination thereof, and controls the light emission amount (light amount ratio) of each light source, whereby images of a plurality of layers at different depths of the subject can be captured.
  • The multi-frame shooting mode is a shooting mode in which a normal light image and one or more special light images, or two or more special light images, are captured while the observation light is switched for each frame. In the multi-frame shooting mode, the light source control unit 31 causes the light source unit 32 to emit different observation light for each frame.
  • Light of each color emitted from the LEDs 32a to 32d enters the light guide 40 inserted into the endoscope scope 11 via an optical path coupling portion formed by dichroic mirrors, lenses, and the like, and a diaphragm mechanism (not shown).
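  • As a rough sketch of this per-frame light switching, the light source control can be thought of as cycling through per-LED light-amount ratios before each frame exposure. The names and numeric ratios below are illustrative assumptions, not values from the patent, which describes the ratios only qualitatively:

```python
# Hypothetical sketch of the multi-frame shooting mode: the observation light
# is switched every frame by cycling through (V, B, G, R) emission ratios.
from itertools import cycle

# Relative emission amounts per observation light -- illustrative values only.
LIGHT_RATIOS = {
    "WL":  (1.0, 1.0, 1.0, 1.0),   # all four LEDs on -> white light
    "BLI": (1.0, 0.8, 0.2, 0.1),   # high V ratio, suppressed G ratio
    "LCI": (1.0, 0.6, 0.5, 0.5),   # higher V ratio than WL
}

def multi_frame_light_sequence():
    """Yield the (name, ratios) to apply before each frame exposure."""
    for name in cycle(["WL", "BLI", "LCI"]):
        yield name, LIGHT_RATIOS[name]

seq = multi_frame_light_sequence()
for _ in range(6):                  # two full WL/BLI/LCI cycles
    name, (v, b, g, r) = next(seq)
    print(f"frame: {name}  V={v} B={b} G={g} R={r}")
```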
  • As the observation light of the light source device 12, white light (light in the white wavelength band or light in a plurality of wavelength bands), light having a peak in one or more specific wavelength bands (special light), or a combination thereof is selected according to the observation purpose.
  • a first example of the specific wavelength band is, for example, a blue band or a green band in a visible region.
  • The wavelength band of the first example includes the wavelength band of 390 nm to 450 nm or 530 nm to 550 nm, and the light of the first example has a peak wavelength in the wavelength band of 390 nm to 450 nm or 530 nm to 550 nm.
  • the second example of the specific wavelength band is, for example, a red band in a visible region.
  • The wavelength band of the second example includes the wavelength band of 585 nm to 615 nm or 610 nm to 730 nm, and the light of the second example has a peak wavelength in the wavelength band of 585 nm to 615 nm or 610 nm to 730 nm.
  • The third example of the specific wavelength band includes a wavelength band in which the extinction coefficient differs between oxyhemoglobin and reduced hemoglobin, and the light of the third example has a peak wavelength in such a wavelength band.
  • The wavelength band of the third example includes 400±10 nm, 440±10 nm, 470±10 nm, or the wavelength band of 600 nm to 750 nm, and the light of the third example has a peak wavelength in one of these wavelength bands.
  • the fourth example of the specific wavelength band is a wavelength band (390 nm to 470 nm) of excitation light used for observation of fluorescence emitted from a fluorescent substance in a living body (fluorescence observation) and for exciting this fluorescent substance.
  • the fifth example of the specific wavelength band is a wavelength band of infrared light.
  • the wavelength band of the fifth example includes a wavelength band of 790 nm to 820 nm or 905 nm to 970 nm, and the light of the fifth example has a peak wavelength in a wavelength band of 790 nm to 820 nm or 905 nm to 970 nm.
  • The endoscope processor 13 includes a processor operation unit 13a, a processor control unit 61, a ROM 62, a digital signal processing circuit (DSP: Digital Signal Processor) 63, an image processing unit 65, a display control unit 66, a storage unit 67, and the like.
  • The processor operation unit 13a includes a power button and an input unit that receives inputs such as a coordinate position indicated on the screen of the display unit 14 with a mouse and a click (execution instruction).
  • The processor control unit 61 reads out necessary programs and data from the ROM 62 according to operation information from the processor operation unit 13a and operation information from the endoscope operation unit 46 received via the endoscope control unit 47, and processes them sequentially, thereby controlling each part of the endoscope processor 13 and controlling the light source device 12.
  • The processor control unit 61 may also receive necessary instruction inputs from another external device such as a keyboard connected via an interface (not shown).
  • Under the control of the processor control unit 61, the DSP 63, which functions as one mode of an image acquisition unit that acquires the image data of each frame of the moving image output from the endoscope scope 11 (the imaging element 45), performs various signal processing such as defect correction processing, offset processing, white balance correction, gamma correction, and demosaicing processing (also referred to as "synchronization processing") on the image data for one frame of the moving image input from the endoscope scope 11, and generates image data for one frame.
  • The image processing unit 65 receives the image data from the DSP 63, performs image processing such as color conversion processing, color emphasis processing, and structure emphasis processing on the input image data as necessary, and generates image data representing an endoscope image in which the observation target is captured.
  • the color conversion process is a process of performing color conversion on image data by 3 ⁇ 3 matrix processing, gradation conversion processing, three-dimensional lookup table processing, or the like.
  • the color emphasis process is a process of emphasizing the color of the image data that has been subjected to the color conversion process, for example, in a direction that makes a difference in the color of blood vessels and mucous membranes.
  • the structure emphasis process is a process of emphasizing a specific tissue or structure included in an observation target such as a blood vessel or a pit pattern, and is performed on image data after the color emphasis process.
  • When a still image or moving image shooting instruction is issued, the image data of each frame of the moving image processed by the image processing unit 65 is recorded in the storage unit 67 as the instructed still image or moving image.
  • The display control unit 66 generates display data for displaying the normal light image or the special light image on the display unit 14 based on the image data input from the image processing unit 65, outputs the generated display data to the display unit 14, and causes the display unit 14 to display a display image (such as a moving image captured by the endoscope scope 11).
  • In the multi-frame shooting mode, the display control unit 66 causes the display unit 14 to display any one of the plurality of images (a part of the images), or causes the display unit 14 to display an observation image calculated by the image processing unit 65 using the plurality of images.
  • The display control unit 66 also causes the display unit 14 to display a recognition result input from the recognizer 15 via the image processing unit 65, or a recognition result input directly from the recognizer 15.
  • When the recognizer 15 detects a region of interest, the display control unit 66 displays an index indicating the region of interest so as to be superimposed on the image displayed on the display unit 14. As the index, highlighting such as changing the color of the region of interest in the display image, displaying a marker, or displaying a bounding box can be considered.
  • the display control unit 66 can display information indicating the presence or absence of the attention area based on the detection result of the attention area by the recognizer 15 so as not to overlap the image displayed on the display 14.
  • As the information indicating the presence or absence of the region of interest, for example, changing the color of the frame of the endoscope image depending on whether a region of interest is detected, or displaying the text "region of interest present!" in a display area different from the endoscope image, can be considered.
  • the display controller 66 causes the display 14 to display the discrimination result.
  • As a display method of the discrimination result, for example, displaying text indicating the result on the display image of the display unit 14 can be considered.
  • The text need not be on the display image, and the method is not particularly limited as long as the correspondence between the text and the display image is understood.
  • the recognizer 15 receives the image after the image processing by the endoscope processor 13. First, the recognition image received by the recognizer 15 will be described.
  • the recognizer 15 of this example is applied when the multi-frame shooting mode is set.
  • When the multi-frame shooting mode is set, the light source device 12 sequentially generates white light, which includes violet light, blue light, green light, and red light, and light (special light) of one or more specific wavelength bands obtained by controlling the lighting of the V-LED 32a, B-LED 32b, G-LED 32c, and R-LED 32d, and the endoscope processor 13 sequentially acquires from the endoscope scope 11 an image under white light (normal light image) and images under special light (special light images).
  • In this example, a normal light image (WL (White Light) image) is captured as the first endoscope image, and two types of special light images, a BLI (Blue Light Imaging or Blue LASER Imaging) image and an LCI (Linked Color Imaging) image, are captured as the second endoscope images. The BLI image and the LCI image are images captured with the observation light for BLI and the observation light for LCI, respectively.
  • The observation light for BLI is observation light in which the ratio of V light, which has a high absorptance in surface blood vessels, is high and the ratio of G light, which has a high absorptance in middle-layer blood vessels, is suppressed; it is suitable for generating an image (BLI image) suited to structure enhancement.
  • The observation light for LCI has a higher ratio of V light than the observation light for WL, and is suitable for capturing minute changes in color tone compared with the observation light for WL. An LCI image is an image that has been subjected to color enhancement processing using the R component signal so that reddish colors become redder and whitish colors become whiter, centering on colors near that of the mucous membrane.
  • the recognizer 15 receives an image set Sa including a plurality of images (in this example, a WL image, a BLI image, and an LCI image) sequentially acquired by the endoscope processor 13 as images for recognition.
  • The recognizer 15 sequentially receives the image sets Sa. Since each image set Sa is composed of three chronologically consecutive frames (a WL image, a BLI image, and an LCI image), the time interval between the image set Sa received by the recognizer 15 at time t_n and the image set Sa received at the preceding time t_{n-1} corresponds to the time of three frames captured in the multi-frame shooting mode.
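  • As a concrete illustration of this input format, a minimal sketch follows: three consecutive RGB frames are stacked along the channel axis into one 9-channel image set. The shapes and random data are illustrative stand-ins, not the patent's actual interface:

```python
# Sketch of assembling the image set Sa: the WL, BLI, and LCI frames (each
# RGB, i.e. 3 channels) are concatenated into one 9-channel recognizer input.
import numpy as np

H, W = 480, 640
wl_img  = np.random.rand(H, W, 3).astype(np.float32)  # WL frame  (RGB)
bli_img = np.random.rand(H, W, 3).astype(np.float32)  # BLI frame (RGB)
lci_img = np.random.rand(H, W, 3).astype(np.float32)  # LCI frame (RGB)

image_set_sa = np.concatenate([wl_img, bli_img, lci_img], axis=-1)
print(image_set_sa.shape)  # (480, 640, 9) -> N = 9 channels
```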
  • FIG. 4 is a schematic diagram showing a typical configuration example of a convolutional neural network (CNN: Convolutional Neural Network), which is one of the learning models constituting the recognizer 15.
  • The CNN 15 is, for example, a learning model for detecting the position of a region of interest (lesion, surgical scar, treatment scar, treatment tool, etc.) in an endoscope image and discriminating the type of lesion; it has a multilayer structure and holds a plurality of weight parameters.
  • The CNN 15 becomes a trained model when the weight parameters are set to optimal values, and functions as a recognizer.
  • The CNN 15 includes an input layer 15A, an intermediate layer 15B having a plurality of convolutional layers and a plurality of pooling layers, and an output layer 15C; in each layer, a plurality of "nodes" are connected by "edges".
  • The CNN 15 of the present example is a learning model that performs segmentation for recognizing the position of a region of interest in an endoscope image, and a fully convolutional network (FCN), which is a type of CNN, is applied to it.
  • With the FCN, the position of the region of interest in the endoscope image can be grasped at the pixel level.
  • the image set Sa for recognition (FIG. 3) is input to the input layer 15A.
  • the intermediate layer 15B is a part for extracting features from the image set Sa input from the input layer 15A.
  • Each convolutional layer in the intermediate layer 15B performs filtering processing on nearby nodes of the image set Sa or of the previous layer (a convolution operation using a filter) to obtain a "feature map".
  • the pooling layer reduces (or enlarges) the feature map output from the convolutional layer to create a new feature map.
  • the “convolution layer” has a role of extracting features such as edge extraction from an image, and the “pooling layer” has a role of providing robustness so that the extracted features are not affected by translation or the like.
  • The intermediate layer 15B is not limited to one in which a convolutional layer and a pooling layer form one set; it may include consecutive convolutional layers or a normalization layer.
  • the output layer 15C is a part that outputs a recognition result for detecting the position of the attention area in the endoscope image and classifying (discriminating) the type of lesion based on the features extracted by the intermediate layer 15B.
  • The CNN 15 is trained using a large number of pairs of a learning image set Sa and correct data for the image set Sa; the coefficients and offset values of the filters applied to each convolutional layer of the CNN 15 are set to optimal values by this training data set.
  • the correct answer data is preferably a region of interest or a discrimination result specified by a doctor with respect to an endoscopic image (in this example, at least one image of the image set Sa).
  • FIG. 5 is a schematic diagram showing a configuration example of the intermediate layer 15B of the CNN 15 shown in FIG. 4.
  • In the first (1st) convolutional layer, a convolution operation is performed between the image set Sa for recognition and a filter F_1.
  • The image set Sa consists of N images (N channels) each having an image size of H vertically and W horizontally; in this example, since the WL image, the BLI image, and the LCI image are each RGB images, the image set Sa is an image of 9 channels.
  • Since the image set Sa has N channels (N images), the filter F_1 convolved with it has, for a filter of size 5, a filter size of 5 × 5 × N.
  • The filter used in the second convolutional layer has, for a filter of size 3, a filter size of 3 × 3 × M, where M is the number of feature maps output from the first convolutional layer.
  • The size of the "feature map" in the n-th convolutional layer is smaller than the size of the "feature map" in the second convolutional layer, because downscaling has been performed by the convolutional and pooling layers in the preceding stages.
  • the convolutional layer in the first half of the intermediate layer 15B is responsible for extraction of feature values, and the convolutional layer in the second half is responsible for segmentation of the object (region of interest).
  • In the latter-half convolutional layers, upscaling is performed, and in the last convolutional layer, one "feature map" having the same size as the input image set Sa is obtained.
  • The output layer 15C (FIG. 4) of the CNN 15 grasps the position of the region of interest in the images of the image set Sa at the pixel level using the "feature map" obtained from the intermediate layer 15B; that is, it can detect whether each pixel of the endoscope image belongs to the region of interest and output the detection result.
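  • To make the structure above concrete, here is a minimal FCN sketch, assuming PyTorch (the patent does not name a framework): a 9-channel image set is downscaled by convolution and pooling layers, then upscaled back to the input size so that one feature map of the same size as Sa can be read per pixel as region-of-interest membership. Layer counts and channel sizes are illustrative, not the patent's actual architecture:

```python
# Minimal fully convolutional network (FCN) sketch for pixel-level
# region-of-interest segmentation over a 9-channel image set Sa.
import torch
import torch.nn as nn

class MiniFCN(nn.Module):
    def __init__(self, in_channels=9):
        super().__init__()
        # First half: feature extraction with downscaling (conv + pooling)
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=5, padding=2),  # 5x5xN filters
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),           # 3x3xM filters
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Second half: upscaling back to the input size for segmentation
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 1, kernel_size=2, stride=2),
        )

    def forward(self, x):
        # Output: one "feature map" the same size as the input image set,
        # interpreted per pixel as region-of-interest membership.
        return torch.sigmoid(self.decoder(self.encoder(x)))

model = MiniFCN()
sa = torch.randn(1, 9, 480, 640)   # batch of one 9-channel image set Sa
roi_map = model(sa)
print(roi_map.shape)               # torch.Size([1, 1, 480, 640])
```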
  • Since recognition is performed using a plurality of images (a WL image, a BLI image, and an LCI image) sequentially acquired in the multi-frame shooting mode, the recognition accuracy can be improved compared with the case where recognition is performed using any one (one type) of the WL image, the BLI image, and the LCI image.
  • The CNN 15 of the present example recognizes the position of the region of interest in the endoscope image, but the recognizer (CNN) according to the present invention is not limited to this and may execute discrimination regarding a lesion and output the discrimination result.
  • For example, the recognizer may classify the endoscope image into three categories of "neoplastic", "non-neoplastic", and "other", and output the discrimination result as three scores corresponding to these categories (the three scores summing to 100%), or output a classification result if the three scores can be clearly classified.
  • When the CNN outputs a classification result in this way, a CNN having one or more fully connected layers as the last layers of the intermediate layer is preferable, instead of the fully convolutional network (FCN).
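  • As a sketch of this discrimination variant (again assuming PyTorch, with illustrative sizes), the FCN tail is replaced by fully connected layers and a softmax so that the three category scores sum to 100%:

```python
# Classification head sketch: fully connected layers over intermediate-layer
# features, producing three scores ("neoplastic", "non-neoplastic", "other").
import torch
import torch.nn as nn

head = nn.Sequential(
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 128),   # fully connected layers replacing the FCN tail
    nn.ReLU(),
    nn.Linear(128, 3),            # one logit per category
)

features = torch.randn(1, 64, 8, 8)          # stand-in for intermediate-layer output
scores = torch.softmax(head(features), dim=1) * 100
print(scores)        # e.g. tensor([[72.1, 21.4, 6.5]]) -- three scores
print(scores.sum())  # ~100.0 -- the three scores sum to 100%
```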
  • FIG. 6 is a block diagram showing a main configuration used for explaining the operation of the endoscope system 10 according to the present invention.
  • In the multi-frame shooting mode, observation lights (V light, B light, G light, and R light) having different peak wavelengths are each applied to the subject 20 via the light guide 40. Since the V light, B light, G light, and R light reach layers at different depths of the subject 20, images of the subject 20 at different depths can be captured with these observation lights.
  • In this example, a WL image, a BLI image, and an LCI image are sequentially acquired using a plurality of different observation lights (for example, first observation light for WL, second observation light for BLI, and third observation light for LCI); as described above, the observation lights for WL, BLI, and LCI differ in the light intensity ratios of V light, B light, G light, and R light.
  • In the multi-frame shooting mode, a WL image, a BLI image, and an LCI image are sequentially and repeatedly captured by irradiation with the plurality of different observation lights. Since the WL image, the BLI image, and the LCI image are each color images, the endoscope processor 13 generates RGB three-channel WL, BLI, and LCI images.
  • the recognizer 15 receives an image set Sa (images of 9 channels in total) including a WL image, a BLI image, and an LCI image as images for recognition.
  • the recognizer 15 detects the position of the region of interest (in this example, the lesion region) in the endoscope image, and outputs position information (recognition result) indicating the lesion region to the endoscope processor 13.
  • the image processing unit 65 of the endoscope processor 13 generates a WL image, a BLI image, and an LCI image from an image signal input from the endoscope 11 and also generates an observation image.
  • A part of the plurality of images (for example, the WL image among the WL image, the BLI image, and the LCI image) may be used as the observation image, or an image calculated using the plurality of images (an image obtained by combining two or more of the WL image, the BLI image, and the LCI image) may be used as the observation image. In either case, the observation image is preferably one type of image.
  • the display control unit 66 inputs the observation image from the image processing unit 65, inputs the position information indicating the lesion area from the recognizing device 15, and causes the display device 14 to display the observation image and the recognition result.
  • The display control unit 66 displays the observation image 26 on the display unit 14 and performs emphasis processing for emphasizing the recognized region of interest (lesion area).
  • In this example, the lesion area is highlighted by superimposing an index 28 indicating the lesion area on the observation image 26 displayed on the display unit 14.
  • As the display of the index 28, in addition to highlighting such as changing the color of the lesion area, display of a boundary line indicating the outline of the lesion area, display of a marker indicating the lesion area, and display of a bounding box can be considered; an illustrative sketch follows.
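  • As a sketch of such index display (using OpenCV as an assumed drawing library; the coordinates stand in for a lesion region reported by the recognizer):

```python
# Superimposing an index on the observation image: a bounding box, a marker,
# and a text label drawn over the frame at the recognized lesion region.
import cv2
import numpy as np

observation_img = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in WL frame
x0, y0, x1, y1 = 200, 150, 320, 260                        # illustrative recognizer output

cv2.rectangle(observation_img, (x0, y0), (x1, y1), (0, 255, 0), 2)   # bounding box
cv2.drawMarker(observation_img, ((x0 + x1) // 2, (y0 + y1) // 2),
               (0, 0, 255), markerType=cv2.MARKER_CROSS, markerSize=20)
cv2.putText(observation_img, "ROI", (x0, y0 - 8),
            cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 1)
```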
  • The recognizer 15 of the present example recognizes the position of the region of interest in the endoscope image, but the present invention is not limited to this; the recognizer 15 may execute discrimination regarding a lesion and output the discrimination result.
  • As a display method of the discrimination result, for example, displaying text indicating the result on the image of the display unit 14 is conceivable. The display position of the text need not be on the image and may be a window separate from the image, as long as the correspondence between the text and the image is understood, and is not particularly limited.
  • In a case where a monochrome imaging element is used, R light, G light, B light, and V light are emitted sequentially, and R, G, B, and V images of the corresponding colors are captured in a frame-sequential manner.
  • FIG. 7 is a diagram showing an example of an image set including an R image, a G image, a B image, and a V image, which are imaged in a frame sequential manner.
  • The endoscope processor 13 can generate observation images such as a WL image, a BLI image, and an LCI image from the plurality of images (R image, G image, B image, and V image) sequentially acquired using the plurality of different observation lights (R light, G light, B light, and V light); these observation images can be generated by adjusting the synthesis ratio of the R, G, B, and V images.
  • The image set may include an image obtained by multiplying at least two of the R, G, B, and V images by preset coefficients and combining them (four arithmetic operations). For example, an image obtained by dividing, pixel by pixel, the image having a center wavelength of 410 nm (V image) by the image having a center wavelength of 450 nm (B image), or an image obtained by multiplying them pixel by pixel, may be used; a sketch follows.
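  • A minimal sketch of such a four-arithmetic-operation image, assuming NumPy arrays as stand-ins for the V (410 nm) and B (450 nm) frames, with illustrative coefficients:

```python
# Per-pixel arithmetic between the 410 nm (V) and 450 nm (B) frames.
import numpy as np

v_img = np.random.rand(480, 640).astype(np.float32) + 1e-3  # 410 nm frame
b_img = np.random.rand(480, 640).astype(np.float32) + 1e-3  # 450 nm frame

a, c = 1.0, 1.0                          # preset coefficients (illustrative)
ratio_img   = (a * v_img) / (c * b_img)  # per-pixel division
product_img = (a * v_img) * (c * b_img)  # per-pixel multiplication
```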
  • the recognizer 15 can receive the WL image, the BLI image, and the LCI image generated by the endoscope processor 13 as an image set Sb, and return a recognition result for the endoscope image to the endoscope processor 13.
  • The recognizer 15 of this example accepts an image set Sa (images of 9 channels in total) including a WL image, a BLI image, and an LCI image as images for recognition, but is not limited to this; it may accept an image set including an R image, a G image, a B image, and a V image, and output a recognition result for the endoscope image.
  • FIG. 8 is a flowchart showing an embodiment of the image processing method according to the present invention, and shows a processing procedure of each unit of the endoscope system 10 shown in FIG.
  • the multi-frame shooting mode is set, and the endoscope 11 sequentially captures multi-frame images using a plurality of different observation lights (step S10).
  • the endoscope processor 13 acquires an image set constituting a multi-frame image captured by the endoscope 11 (step S12, first step).
  • The image set may be a WL image, a BLI image, and an LCI image captured by the endoscope scope 11 with the observation lights for WL, BLI, and LCI, or a WL image, a BLI image, and an LCI image generated from the R, G, B, and V images captured in a frame-sequential manner. The special light image may be only one of the BLI image and the LCI image, or may be a special light image captured with other special light.
  • The image set may also include no WL image (normal light image) and instead include two or more special light images, such as a first special light image captured with first special light and a second special light image captured with second special light. In short, any image set may be used as long as it includes a plurality of images sequentially acquired using a plurality of different observation lights.
  • the image processing unit 65 of the endoscope processor 13 generates an observation image based on the acquired image set (Step S14).
  • The observation image is an image calculated using a part of the plurality of images (for example, the WL image among the WL image, the BLI image, and the LCI image) or using the plurality of images.
  • Based on the image set received via the endoscope processor 13, the recognizer 15 performs detection of the position of the region of interest shown in the endoscope image, discrimination of the type of lesion, and the like, and outputs the recognition result (step S16, second step).
  • the display controller 66 causes the display 14 to display the generated observation image and the recognition result obtained by the recognizer 15 (step S18, third step).
  • step S20 it is determined whether or not the imaging of the multi-frame image is to be ended. If the imaging of the multi-frame image is to be continued (in the case of “No”), the process transits to step S10 and proceeds to step S10. To step S20 are repeatedly performed. Thereby, the observation image is displayed as a moving image, and the recognition result of the recognizer 15 is also continuously displayed.
  • In the above embodiment, the endoscope system 10 including the endoscope scope 11 and the like has been described; however, the present invention is not limited to the endoscope system 10 and may be an image processing device including the endoscope processor 13 and the recognizer 15. In this case, the endoscope processor 13 and the recognizer 15 may be integrated or may be separate.
  • The different observation lights are not limited to those emitted from the four-color LEDs. For example, a blue laser diode that emits blue laser light having a center wavelength of 445 nm and a blue-violet laser diode that emits blue-violet laser light having a center wavelength of 405 nm may be used as light sources, and the laser light of the blue laser diode and the blue-violet laser diode may be applied to a YAG (Yttrium Aluminum Garnet) based phosphor to cause it to emit light.
  • The blue-violet laser light is transmitted without exciting the phosphor. Therefore, by adjusting the intensities of the blue laser light and the blue-violet laser light, the observation light for WL, the observation light for BLI, and the observation light for LCI can be emitted; when only the blue-violet laser diode emits light, observation light having a center wavelength of 405 nm can be emitted.
  • the observation image according to the present invention is not limited to a moving image, but may be a still image stored in the storage unit 67 or the like, and the recognizer may output a recognition result based on a still image set.
  • The recognizer is not limited to a CNN and may be a machine learning model other than a CNN, such as a DBN (Deep Belief Network) or an SVM (Support Vector Machine).
  • The hardware structure of the endoscope processor 13 and/or the recognizer 15 is realized by various processors as described below. The various processors include a CPU (Central Processing Unit), which is a general-purpose processor that executes software (programs) and functions as various control units; a programmable logic device (PLD) such as an FPGA (Field Programmable Gate Array), whose circuit configuration can be changed after manufacturing; and a dedicated electric circuit such as an ASIC (Application Specific Integrated Circuit), which is a processor having a circuit configuration designed specifically to execute a specific process.
  • One processing unit may be configured by one of these various processors, or by two or more processors of the same or different types (for example, a plurality of FPGAs, or a combination of a CPU and an FPGA). A plurality of control units may also be configured by one processor. As a first example of configuring a plurality of control units with one processor, as represented by a computer such as a client or a server, one processor is configured by a combination of one or more CPUs and software, and this processor functions as the plurality of control units.
  • As a second example, as represented by a system-on-chip (SoC), a processor that realizes the functions of the entire system including the plurality of control units with one IC (Integrated Circuit) chip is used.
  • the various control units are configured by using one or more of the various processors described above as a hardware structure.

Abstract

Provided are an image processing device, method, and an endoscopic system with which it is possible to improve recognition accuracy on the basis of a plurality of images and with which it is possible to present preferable observation images and recognition results. The present invention is provided with: a recognizer (15) which receives an image set comprising a plurality of images acquired sequentially using a plurality of different observation beams and which outputs a recognition result with respect to the image set; and a display control unit (66), of an endoscope processor (13), for causing both a recognition result and an observation image calculated using all or a portion of a plurality of images, to be displayed on a display unit (14). Since the recognizer (15) receives a set of images acquired sequentially using a plurality of different observation beams and acquires a recognition result with respect to the image set, it is possible to improve recognition accuracy as compared with the case where image recognition is performed on the basis of a single image acquired using a single observation beam.

Description

Image processing apparatus, method, and endoscope system
The present invention relates to an image processing apparatus, an image processing method, and an endoscope system, and more particularly to technology that can be used to assist a doctor in endoscopy.
In the medical field, examinations using endoscope apparatuses are performed. In recent years, it has become known to support an examination by recognizing the position and type of a lesion included in an endoscope image through image analysis and reporting the recognition result.
In image analysis for recognition, machine learning of images, including deep learning, is widely used.
Patent Literature 1 proposes an information processing apparatus including an acquisition unit that acquires a plurality of images of cells photographed in time series, an assigning unit that assigns time-series evaluation values to the acquired images for each of one or more predetermined evaluation items, and an evaluation unit that evaluates the cells based on the temporal change of the assigned evaluation values. Here, the evaluation unit assigns the time-series evaluation values to the plurality of images according to a machine learning algorithm and evaluates the observed cells based on the temporal change of those values. This enables an evaluation that comprehensively considers the time-series behavior of the cells.
 また、特許文献2には、動画中のデータ列である時系列入力データを取得し、時系列入力データにおける一の時点の入力データに対応する複数の入力値を、時系列入力データに対応する学習済みのモデル(ボルツマンマシンを構成するモデル)が有する複数のノードに供給し、時系列入力データにおける予測対象時点より前の入力データ系列と、モデルにおける入力データ系列中の入力データに対応する複数の入力値のそれぞれと複数のノードのそれぞれとの間の重みパラメータとに基づいて、入力データ系列が発生した条件下において予測対象時点に対応する各入力値となる条件付確率を算出し、予測対象時点に対応する各入力値の条件付確率に基づいて、時系列入力データが発生した条件の下で次の入力データが予め定められた値となる条件付確率を算出する処理装置が提案されている。 Further, in Patent Document 2, time-series input data which is a data sequence in a moving image is acquired, and a plurality of input values corresponding to input data at one time point in the time-series input data correspond to the time-series input data. It is supplied to a plurality of nodes of a trained model (a model constituting a Boltzmann machine), and a plurality of nodes corresponding to the input data series before the prediction target time in the time-series input data and the input data in the input data series in the model Calculating a conditional probability to be each input value corresponding to a prediction target time point under a condition in which an input data sequence occurs, based on a weight parameter between each of the input values and each of the plurality of nodes; Based on the conditional probability of each input value corresponding to the target time point, the next input data becomes a predetermined value under the condition that the time-series input data occurs. Processing apparatus has been proposed to calculate the conditional probability.
 この処理装置は、一例として時系列に並ぶT-1個の画像データに基づき、次の時刻に配列される1つの画像データを予測して、合計T個の画像を含む動画を生成することができる。 As an example, the processing apparatus can generate a moving image including a total of T images by predicting one image data arrayed at the next time based on T-1 image data arranged in time series. it can.
Patent Literature 1: JP 2018-22216 A
Patent Literature 2: JP 2016-71697 A
 The information processing apparatus described in Patent Literature 1 evaluates various changes in the culture process of the imaged cells (fertilized eggs) from a plurality of images captured in time series, and those images are captured under the same imaging conditions. This is because, unless the images are captured under the same imaging conditions, changes in the fertilized eggs cannot be evaluated from the acquired images. That is, the plurality of images are not images sequentially acquired using different observation lights.
 The processing apparatus described in Patent Literature 2 enables prediction of the image data at the next time by a trained model that receives time-series input data, and that time-series input data is likewise captured under the same imaging conditions. This is because, unless the data are captured under the same imaging conditions, the image data at the next time cannot be predicted from the input time-series data. That is, the time-series input data is not data sequentially acquired using different observation lights.
 Moreover, the inventions described in Patent Literatures 1 and 2 both input a plurality of time-series images in order to predict an object that changes over time (cells, a future moving image); they do not input a plurality of images for the purpose of improving recognition accuracy in a recognizer.
 The present invention has been made in view of such circumstances, and an object thereof is to provide an image processing apparatus, an image processing method, and an endoscope system capable of improving recognition accuracy based on a plurality of images and of presenting a good observation image and recognition result.
 To achieve the above object, an image processing apparatus according to one aspect of the present invention comprises: a recognizer that receives an image set composed of a plurality of images sequentially acquired using a plurality of different observation lights and outputs a recognition result for the image set; and a display control unit that causes a display unit to display the recognition result together with an observation image that is a part of the plurality of images or is calculated using the plurality of images.
 According to this aspect of the present invention, an image set sequentially acquired using a plurality of different observation lights is input and a recognition result for the image set is acquired, so that recognition accuracy can be improved compared with the case where recognition is performed based on a single image acquired with a single observation light. In addition, by displaying the recognition result on the display unit together with the observation image obtained from the plurality of images, the recognition result can be presented appropriately.
 In an image processing apparatus according to another aspect of the present invention, it is preferable that the recognizer has a trained model learned from sets of a plurality of learning images and correct-answer data, and outputs a recognition result based on the trained model each time it receives a plurality of images for recognition.
 In an image processing apparatus according to still another aspect of the present invention, it is preferable that the trained model is constituted by a convolutional neural network. Convolutional neural networks excel at image recognition.
 In an image processing apparatus according to still another aspect of the present invention, it is preferable that the plurality of images include a first endoscope image and a second endoscope image acquired using observation light different from that of the first endoscope image. In endoscopy, a plurality of images may be acquired using a plurality of different observation lights, and the present invention can be applied to endoscopy in such a case.
 In an image processing apparatus according to still another aspect of the present invention, it is preferable that the first endoscope image is a normal light image captured with normal light and the second endoscope image is a special light image captured with special light. In general, a normal light image is used as an observation image, and a special light image is used when it is desired to observe surface structure.
 In an image processing apparatus according to still another aspect of the present invention, the special light image includes two or more special light images captured with two or more different special lights. Two or more special light images can be captured according to the observation purpose, for example when the depths of the surface structures to be observed differ.
 In an image processing apparatus according to still another aspect of the present invention, the first endoscope image is a first special light image captured with first special light, and the second endoscope image is a second special light image captured with second special light different from the first special light. That is, the plurality of endoscope images may not include a normal light image.
 In an image processing apparatus according to still another aspect of the present invention, it is preferable that the display control unit causes the display unit to display, as a moving image, the observation image that is a part of the plurality of images or is calculated using the plurality of images. This makes it possible to perform an examination in real time while viewing the observation image and the recognition result displayed as a moving image.
 In an image processing apparatus according to still another aspect of the present invention, it is preferable that the recognizer recognizes a region of interest included in the plurality of images, and the display control unit superimposes an index indicating the recognized region of interest on the image displayed on the display unit. This makes it possible to support the examination so that a region of interest in the observation image is not overlooked.
 In an image processing apparatus according to still another aspect of the present invention, it is preferable that the recognizer recognizes a region of interest included in the plurality of images, and the display control unit displays information indicating the presence or absence of the region of interest so as not to overlap the image displayed on the display unit. This makes it possible to report that a region of interest exists in the observation image, while ensuring that observation of the image is not hindered by the information displayed on the display unit.
 In an image processing apparatus according to still another aspect of the present invention, it is preferable that the recognizer performs discrimination regarding a lesion based on the plurality of images and outputs a discrimination result, and the display control unit causes the display unit to display the discrimination result. This enables visual inspection of the observation image while referring to the discrimination result obtained by the recognizer.
 An endoscope system according to still another aspect of the present invention comprises: a light source device that sequentially generates first observation light and second observation light different from the first observation light; an endoscope scope that captures a plurality of images by sequentially imaging an observation target sequentially illuminated with the first observation light and the second observation light; a display unit; and the image processing apparatus described above, wherein the recognizer receives an image set composed of the plurality of images captured by the endoscope scope.
 An endoscope system according to still another aspect of the present invention preferably comprises an endoscope processor that receives the plurality of images captured by the endoscope scope and performs image processing on them, and the recognizer receives the plurality of images after the image processing by the endoscope processor. The endoscope processor has a function of processing the plurality of images captured by the endoscope scope, and the recognizer can detect and discriminate a lesion area using the processed images. The recognizer may be separate from the endoscope processor or may be built into it.
 An image processing method according to still another aspect of the present invention includes: a first step of receiving an image set composed of a plurality of images acquired using a plurality of different observation lights; a second step in which a recognizer outputs a recognition result for the image set; and a third step in which a display control unit causes a display unit to display the recognition result together with an observation image that is a part of the plurality of images or is calculated using the plurality of images, wherein the processing of the first to third steps is repeatedly executed.
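As a rough, non-authoritative sketch of this repeated three-step flow (all names below are hypothetical illustrations, not elements of the claimed apparatus), the loop could be pictured as:

```python
# Minimal sketch of the three-step image processing method, assuming a
# hypothetical frame source, recognizer, and display (names illustrative).

def inspection_loop(frame_source, recognizer, display):
    while frame_source.is_active():
        # Step 1: receive an image set of images acquired using a
        # plurality of different observation lights.
        image_set = frame_source.next_image_set()   # e.g. (WL, BLI, LCI)

        # Step 2: the recognizer outputs a recognition result for the set.
        result = recognizer.recognize(image_set)

        # Step 3: display an observation image (a part of the images, or
        # one calculated from them) together with the recognition result.
        observation_image = image_set[0]            # e.g. the WL image
        display.show(observation_image, result)
```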
 In an image processing method according to still another aspect of the present invention, it is preferable that, in the second step, a recognizer having a trained model learned from learning image sets and correct-answer data outputs a recognition result based on the trained model each time it receives an image set for recognition.
 In an image processing method according to still another aspect of the present invention, it is preferable that the trained model is constituted by a convolutional neural network.
 In an image processing method according to still another aspect of the present invention, it is preferable that the plurality of images include a first endoscope image and a second endoscope image acquired using observation light different from that of the first endoscope image.
 In an image processing method according to still another aspect of the present invention, it is preferable that the first endoscope image is a normal light image captured with normal light and the second endoscope image is a special light image captured with special light.
 According to the present invention, recognition is performed on an image set composed of a plurality of images sequentially acquired using a plurality of different observation lights, so that recognition accuracy can be improved. In addition, by displaying the recognition result on the display unit together with the observation image obtained from the plurality of images, the recognition result can be presented appropriately.
FIG. 1 is a perspective view showing the appearance of an endoscope system 10 according to the present invention.
FIG. 2 is a block diagram showing the electrical configuration of the endoscope system 10.
FIG. 3 is a diagram showing an example of multi-frame images and image sets captured mainly in the multi-frame shooting mode.
FIG. 4 is a schematic diagram showing a representative configuration example of a convolutional neural network, one of the learning models constituting the recognizer 15.
FIG. 5 is a schematic diagram showing a configuration example of the intermediate layer 15B of the CNN 15 shown in FIG. 4.
FIG. 6 is a block diagram showing the main configuration used to explain the operation of the endoscope system 10 according to the present invention.
FIG. 7 is a diagram showing an example of R, G, B, and V images and image sets captured in a frame-sequential manner.
FIG. 8 is a flowchart showing an embodiment of the image processing method according to the present invention.
 Hereinafter, preferred embodiments of the image processing apparatus, image processing method, and endoscope system according to the present invention will be described with reference to the accompanying drawings.
 [Overall configuration of endoscope system]
 FIG. 1 is a perspective view showing the appearance of an endoscope system 10 according to the present invention.
 As shown in FIG. 1, the endoscope system 10 is mainly composed of an endoscope scope (here, a flexible endoscope) 11 that images an observation target inside a subject, a light source device 12, an endoscope processor 13, a display unit (display) 14 such as a liquid crystal monitor, and a recognizer 15.
 The light source device 12 supplies the endoscope scope 11 with various kinds of observation light, such as white light for capturing normal light images and light of specific wavelength bands for capturing special light images.
 The endoscope processor 13 has a function of generating image data of a normal light image, special light image, or observation image for display/recording based on the image signal obtained by the endoscope scope 11, a function of controlling the light source device 12, and a function of causing the display 14 to show the normal image or observation image and the recognition result from the recognizer 15. Although the recognizer 15 will be described in detail later, it is the part that receives an endoscopic image and performs recognition on it, such as detecting the position of a region of interest (a lesion, surgical scar, treatment scar, treatment tool, or the like) and discriminating the type of lesion.
 The display 14 displays the normal image, the special light image or observation image, and the recognition result from the recognizer 15 based on the display image data input from the endoscope processor 13.
 The endoscope scope 11 comprises a flexible insertion section 16 to be inserted into the subject, a handheld operation section 17 connected to the proximal end of the insertion section 16 and used for gripping the endoscope scope 11 and operating the insertion section 16, and a universal cord 18 that connects the handheld operation section 17 to the light source device 12 and the endoscope processor 13.
 An illumination lens 42, an objective lens 44, an imaging element 45, and the like are built into the insertion section distal end portion 16a, which is the distal end of the insertion section 16 (see FIG. 2). A freely bendable bending portion 16b is connected to the rear end of the insertion section distal end portion 16a, and a flexible tube portion 16c is connected to the rear end of the bending portion 16b.
 The handheld operation section 17 is provided with an angle knob 21, operation buttons 22, a forceps inlet 23, and the like. The angle knob 21 is rotated to adjust the bending direction and bending amount of the bending portion 16b. The operation buttons 22 are used for various operations such as air supply, water supply, and suction. The forceps inlet 23 communicates with a forceps channel in the insertion section 16. The handheld operation section 17 is also provided with an endoscope operation unit 46 (see FIG. 2) for performing various settings.
 The universal cord 18 incorporates an air/water supply channel, a signal cable, a light guide, and the like. The distal end of the universal cord 18 is provided with a connector section 25a connected to the light source device 12 and a connector section 25b connected to the endoscope processor 13. With this arrangement, observation light is supplied from the light source device 12 to the endoscope scope 11 via the connector section 25a, and the image signal obtained by the endoscope scope 11 is input to the endoscope processor 13 via the connector section 25b.
 The light source device 12 is provided with a light source operation unit 12a including a power button, a lighting button for turning on the light source, a brightness adjustment button, and the like, and the endoscope processor 13 is provided with a processor operation unit 13a including a power button and an input unit that receives input from a pointing device such as a mouse (not shown). Although the endoscope processor 13 and the light source device 12 of this example are separate units, the endoscope processor may instead have a built-in light source device.
 [Electrical configuration of endoscope system]
 FIG. 2 is a block diagram showing the electrical configuration of the endoscope system 10.
 As shown in FIG. 2, the endoscope scope 11 roughly comprises a light guide 40, an illumination lens 42, an objective lens 44, an imaging element 45, an endoscope operation unit 46, an endoscope control unit 47, and a ROM (Read Only Memory) 48.
 A large-diameter optical fiber, a bundle fiber, or the like is used for the light guide 40. The entrance end of the light guide 40 is inserted into the light source device 12 via the connector section 25a, and its exit end passes through the insertion section 16 and faces the illumination lens 42 provided in the insertion section distal end portion 16a. The illumination light supplied from the light source device 12 to the light guide 40 is applied to the observation target through the illumination lens 42, and the illumination light reflected and/or scattered by the observation target enters the objective lens 44.
 The objective lens 44 forms the reflected or scattered light of the incident illumination light (that is, the optical image of the observation target) on the imaging surface of the imaging element 45.
 The imaging element 45 is a CMOS (complementary metal oxide semiconductor) or CCD (charge coupled device) imaging element, positioned and fixed relative to the objective lens 44 at a position behind it. A plurality of pixels, each composed of a photoelectric conversion element (photodiode) that photoelectrically converts the optical image, are two-dimensionally arranged on the imaging surface of the imaging element 45. On the entrance-surface side of the pixels of the imaging element 45 of this example, red (R), green (G), and blue (B) color filters are arranged pixel by pixel, forming R pixels, G pixels, and B pixels. The filter array of the RGB color filters is typically a Bayer array, but is not limited to this.
 The imaging element 45 converts the optical image formed by the objective lens 44 into an electrical image signal and outputs it to the endoscope processor 13.
 When the imaging element 45 is a CMOS type, an A/D (Analog/Digital) converter is built in, and a digital image signal is output directly from the imaging element 45 to the endoscope processor 13. When the imaging element 45 is a CCD type, the image signal output from the imaging element 45 is converted into a digital image signal by an A/D converter or the like (not shown) and then output to the endoscope processor 13.
 The endoscope operation unit 46 includes a still image capture button (not shown) and a shooting mode setting unit that sets one of the normal light image shooting mode, the special light image shooting mode, and the multi-frame shooting mode. The shooting mode setting unit may instead be provided in the processor operation unit 13a of the endoscope processor 13.
 The endoscope control unit 47 sequentially executes various programs and data read from the ROM 48 or the like in response to operations on the endoscope operation unit 46, and mainly controls driving of the imaging element 45. For example, in the normal light image shooting mode, the endoscope control unit 47 controls the imaging element 45 to read out the signals of its R, G, and B pixels. In the special light image shooting mode or the multi-frame shooting mode, when violet light is emitted from the V-LED 32a or blue light is emitted from the B-LED 32b as observation light for acquiring a specific special light image, the endoscope control unit 47 controls the imaging element 45 to read out only the signals of the B pixels, which have spectral sensitivity in the wavelength bands of the violet and blue light, or to read out any one or two of the three color pixels (R, G, and B pixels).
 The endoscope control unit 47 also communicates with the processor control unit 61 of the endoscope processor 13 and transmits, to the endoscope processor 13, operation information from the endoscope operation unit 46 and identification information, stored in the ROM 48, for identifying the type of the endoscope scope 11.
 The light source device 12 has a light source control unit 31 and a light source unit 32. The light source control unit 31 controls the light source unit 32 and communicates with the processor control unit 61 of the endoscope processor 13 to exchange various information.
 The light source unit 32 has, for example, a plurality of semiconductor light sources. In this embodiment, the light source unit 32 has LEDs of four colors: a V-LED (Violet Light Emitting Diode) 32a, a B-LED (Blue Light Emitting Diode) 32b, a G-LED (Green Light Emitting Diode) 32c, and an R-LED (Red Light Emitting Diode) 32d. The V-LED 32a, B-LED 32b, G-LED 32c, and R-LED 32d are semiconductor light sources that emit violet (V), blue (B), green (G), and red (R) light as observation light, with peak wavelengths at, for example, 410 nm, 450 nm, 530 nm, and 615 nm, respectively.
 The light source control unit 31 individually controls, for each LED, the turning on and off of the four LEDs of the light source unit 32, the emission amount when lit, and so on, according to the shooting mode set by the shooting mode setting unit. In the normal light image shooting mode, the light source control unit 31 lights all of the V-LED 32a, B-LED 32b, G-LED 32c, and R-LED 32d, so that white light including V light, B light, G light, and R light is used as the observation light.
 In the special light image shooting mode, on the other hand, the light source control unit 31 lights any one of the V-LED 32a, B-LED 32b, G-LED 32c, and R-LED 32d, or an appropriate combination of them, and, when lighting a plurality of light sources, controls the emission amount (light amount ratio) of each light source, thereby enabling imaging of layers of the subject at different depths.
 The multi-frame shooting mode is a shooting mode in which a normal light image and one or more special light images, or two or more special light images, are captured while switching frame by frame. In the multi-frame shooting mode, the light source control unit 31 causes the light source unit 32 to emit a different observation light for each frame.
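As a hedged illustration of this per-frame switching, the sketch below cycles through a table of light-amount ratios for the four LEDs; the ratio values and mode names are placeholders for illustration only, not values disclosed in this specification:

```python
from itertools import cycle

# Hypothetical (V, B, G, R) light-amount ratios per observation light;
# the real ratios are set by the light source control unit 31.
OBSERVATION_LIGHTS = {
    "WL":  (1.0, 1.0, 1.0, 1.0),   # white light: all four LEDs lit
    "BLI": (1.0, 0.7, 0.2, 0.0),   # higher V ratio, suppressed G ratio
    "LCI": (1.0, 0.8, 0.6, 0.5),   # higher V ratio than WL
}

def multi_frame_schedule(order=("WL", "BLI", "LCI")):
    """Yield, frame by frame, the observation light to emit next."""
    for name in cycle(order):
        yield name, OBSERVATION_LIGHTS[name]
```

Each yielded tuple would drive one frame's exposure before the schedule switches to the next observation light.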
 The light of each color emitted from the LEDs 32a to 32d enters the light guide 40 inserted through the endoscope scope 11 via an optical path coupling section formed by dichroic mirrors, lenses, and the like, and a diaphragm mechanism (not shown).
 As the observation light of the light source device 12, light of various wavelength bands is selected according to the observation purpose: white light (light of the white wavelength band or light of a plurality of wavelength bands), light having a peak in one or more specific wavelength bands (special light), or a combination of these.
 A first example of the specific wavelength band is the blue band or green band of the visible range. The wavelength band of this first example includes a wavelength band of 390 nm to 450 nm or 530 nm to 550 nm, and the light of the first example has a peak wavelength within the wavelength band of 390 nm to 450 nm or 530 nm to 550 nm.
 A second example of the specific wavelength band is the red band of the visible range. The wavelength band of this second example includes a wavelength band of 585 nm to 615 nm or 610 nm to 730 nm, and the light of the second example has a peak wavelength within the wavelength band of 585 nm to 615 nm or 610 nm to 730 nm.
 A third example of the specific wavelength band includes a wavelength band in which the absorption coefficient differs between oxyhemoglobin and deoxyhemoglobin, and the light of the third example has a peak wavelength in a wavelength band in which the absorption coefficient differs between oxyhemoglobin and deoxyhemoglobin. The wavelength band of this third example includes 400 ± 10 nm, 440 ± 10 nm, 470 ± 10 nm, or 600 nm to 750 nm, and the light of the third example has a peak wavelength within one of these bands.
 A fourth example of the specific wavelength band is the wavelength band (390 nm to 470 nm) of excitation light that is used for observing fluorescence emitted by a fluorescent substance in a living body (fluorescence observation) and that excites the fluorescent substance.
 A fifth example of the specific wavelength band is the wavelength band of infrared light. The wavelength band of this fifth example includes a wavelength band of 790 nm to 820 nm or 905 nm to 970 nm, and the light of the fifth example has a peak wavelength within the wavelength band of 790 nm to 820 nm or 905 nm to 970 nm.
 The endoscope processor 13 has a processor operation unit 13a, a processor control unit 61, a ROM 62, a digital signal processing circuit (DSP: Digital Signal Processor) 63, an image processing unit 65, a display control unit 66, a storage unit 67, and the like.
 The processor operation unit 13a includes a power button and an input unit that receives inputs such as a coordinate position indicated on the screen of the display 14 with the mouse and clicks (execution instructions).
 The processor control unit 61 reads necessary programs and data from the ROM 62 according to operation information from the processor operation unit 13a and operation information from the endoscope operation unit 46 received via the endoscope control unit 47, and processes them sequentially, thereby controlling each part of the endoscope processor 13 and controlling the light source device 12. The processor control unit 61 may also receive necessary instruction inputs from other external devices, such as a keyboard, connected via an interface (not shown).
 The DSP 63, which functions as one form of an image acquisition unit that acquires the image data of each frame of the moving image output from the endoscope scope 11 (imaging element 45), performs, under the control of the processor control unit 61, various kinds of signal processing such as defect correction processing, offset processing, white balance correction, gamma correction, and demosaicing (also called "synchronization processing") on one frame's worth of image data of the moving image input from the endoscope scope 11, and generates one frame of image data.
 The image processing unit 65 receives the image data from the DSP 63, applies image processing such as color conversion processing, color enhancement processing, and structure enhancement processing to it as necessary, and generates image data representing an endoscopic image in which the observation target appears. The color conversion processing performs color conversion on the image data by 3 × 3 matrix processing, gradation conversion processing, three-dimensional look-up table processing, and the like. The color enhancement processing enhances colors of the color-converted image data, for example in a direction that increases the color difference between blood vessels and mucous membranes. The structure enhancement processing emphasizes specific tissues or structures included in the observation target, such as blood vessels and pit patterns, and is performed on the image data after the color enhancement processing.
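The 3 × 3 matrix step of the color conversion processing can be pictured with a short NumPy sketch; the matrix coefficients below are illustrative placeholders, not the processor's actual values:

```python
import numpy as np

def color_convert(rgb, matrix):
    """Apply a 3x3 color conversion matrix to an H x W x 3 float image."""
    h, w, _ = rgb.shape
    flat = rgb.reshape(-1, 3)
    converted = flat @ matrix.T            # per-pixel 3x3 matrix product
    return converted.reshape(h, w, 3).clip(0.0, 1.0)

# Illustrative coefficients only (roughly identity with small cross terms).
M = np.array([[ 1.05, -0.03, -0.02],
              [-0.02,  1.04, -0.02],
              [-0.01, -0.04,  1.05]], dtype=np.float32)
```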
 When a still image or moving image capture instruction is given, the image data of each frame of the moving image processed by the image processing unit 65 is recorded in the storage unit 67 as the instructed still image or moving image.
 The display control unit 66 generates display data for displaying the normal light image or special light image on the display 14 based on the image data input from the image processing unit 65, outputs the generated display data to the display 14, and causes the display 14 to show a display image (such as the moving image captured by the endoscope scope 11).
 In the multi-frame shooting mode, sequentially displaying as-is the plurality of images sequentially acquired with different observation lights would cause the appearance to change and flicker. The display control unit 66 therefore causes the display 14 to show one of the plurality of images (a part of the images) or an observation image calculated by the image processing unit 65 using the plurality of images.
 The display control unit 66 also causes the display 14 to show the recognition result input from the recognizer 15 via the image processing unit 65, or input directly from the recognizer 15.
 When the recognizer 15 detects a region of interest, the display control unit 66 superimposes an index indicating that region of interest on the image displayed on the display 14. For example, highlighting such as changing the color of the region of interest in the display image, displaying a marker, or displaying a bounding box can serve as the index.
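A minimal sketch of superimposing such an index, here a bounding box drawn with OpenCV (the drawing style and color are assumptions; only the overlay idea comes from the text):

```python
import cv2

def overlay_index(observation_image, box, color=(0, 255, 255)):
    """Superimpose a bounding box indicating a detected region of interest.

    box: (x, y, w, h) in pixel coordinates of the displayed image.
    """
    x, y, w, h = box
    shown = observation_image.copy()
    cv2.rectangle(shown, (x, y), (x + w, y + h), color, thickness=2)
    return shown
```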
 The display control unit 66 can also display information indicating the presence or absence of a region of interest, based on the detection result of the recognizer 15, so that it does not overlap the image displayed on the display 14. For example, the frame color of the endoscopic image may be changed depending on whether a region of interest is detected, or text such as "Region of interest present!" may be displayed in a display area separate from the endoscopic image.
 When the recognizer 15 performs discrimination regarding a lesion, the display control unit 66 causes the display 14 to show the discrimination result. The discrimination result may be displayed, for example, as text representing the result on the display image of the display 14. The text need not be on the display image itself and is not particularly limited as long as its correspondence with the display image is clear.
 [Recognizer 15]
 Next, the recognizer 15 according to the present invention will be described.
 The recognizer 15 receives images after image processing by the endoscope processor 13. First, the recognition images received by the recognizer 15 will be described.
 The recognizer 15 of this example is applied when the multi-frame shooting mode is set.
 When the multi-frame shooting mode is set, the light source device 12 sequentially generates white light including violet, blue, green, and red light, and light of one or more specific wavelength bands (special light) produced by controlling the lighting of the V-LED 32a, B-LED 32b, G-LED 32c, and R-LED 32d, and the endoscope processor 13 sequentially acquires from the endoscope scope 11 an image under the white light (a normal light image) and an image under the special light (a special light image).
 In the multi-frame shooting mode of this example, as shown in FIG. 3, a normal light image (WL (White Light) image) as the first endoscope image and two kinds of special light images (a BLI (Blue Light Imaging or Blue LASER Imaging) image and an LCI (Linked Color Imaging) image) as second endoscope images are repeatedly acquired while switching sequentially frame by frame.
 Here, the BLI image and the LCI image are images captured with observation light for BLI and observation light for LCI, respectively.
 The observation light for BLI has a high proportion of V light, which is strongly absorbed by surface-layer blood vessels, and a suppressed proportion of G light, which is strongly absorbed by middle-layer blood vessels; it is suitable for generating an image (BLI image) suited to enhancing the blood vessels and structures of the mucosal surface layer of the subject.
 The observation light for LCI has a higher proportion of V light than the observation light for WL and is suited to capturing subtle changes in color tone compared with the observation light for WL. An LCI image is an image that has undergone color enhancement processing, also using the R-component signal, such that reddish colors become redder and whitish colors become whiter, centered on colors near the mucous membrane.
 The recognizer 15 receives, as recognition images, an image set Sa composed of the plurality of images sequentially acquired by the endoscope processor 13 (in this example, a WL image, a BLI image, and an LCI image).
 Since the WL image, the BLI image, and the LCI image are each color images, each has an R image, a G image, and a B image (three color channels). Accordingly, the image set Sa input to the recognizer 15 is an image with 9 (= 3 × 3) channels.
 The recognizer 15 sequentially receives image sets Sa. Since each image set Sa is composed of three consecutive frames in time-series order (a WL image, a BLI image, and an LCI image), the time interval at which successive image sets Sa are input corresponds to three frame periods of the frames captured in the multi-frame shooting mode. That is, the time interval between the image set Sa at time t_n received by the recognizer 15 and the image set Sa at the immediately preceding time t_(n-1) corresponds to three frame periods of the frames captured in the multi-frame shooting mode.
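A hedged NumPy sketch of how such 9-channel image sets could be assembled from the frame stream, grouping consecutive WL/BLI/LCI frames into non-overlapping triples (the array layout and helper names are assumptions):

```python
import numpy as np

def to_image_set(wl, bli, lci):
    """Stack three H x W x 3 RGB frames into one H x W x 9 array."""
    return np.concatenate([wl, bli, lci], axis=-1)

def image_sets(frame_stream):
    """Group a WL/BLI/LCI-ordered frame stream into image sets Sa.

    Non-overlapping triples, so each set Sa arrives once every three
    frame periods, matching the interval described above.
    """
    it = iter(frame_stream)
    for wl, bli, lci in zip(it, it, it):
        yield to_image_set(wl, bli, lci)
```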
 FIG. 4 is a schematic diagram showing a representative configuration example of a convolutional neural network (CNN: Convolutional Neural Network), one of the learning models constituting the recognizer 15.
 The CNN 15 is, for example, a learning model that detects the position of a region of interest (a lesion, surgical scar, treatment scar, treatment tool, or the like) appearing in an endoscopic image and discriminates the type of lesion; it has a multilayer structure and holds a plurality of weight parameters. When the weight parameters are set to optimal values, the CNN 15 becomes a trained model and functions as a recognizer.
 As shown in FIG. 4, the CNN 15 comprises an input layer 15A, an intermediate layer 15B having a plurality of convolutional layers and a plurality of pooling layers, and an output layer 15C, each layer having a structure in which a plurality of "nodes" are connected by "edges".
 The CNN 15 of this example is a learning model that performs segmentation to recognize the position of a region of interest appearing in an endoscopic image; a fully convolutional network (FCN: Fully Convolutional Network), a type of CNN, is applied so that the position of the region of interest in the endoscopic image can be grasped at the pixel level.
 The recognition image set Sa (FIG. 3) is input to the input layer 15A.
 The intermediate layer 15B is the part that extracts features from the image set Sa input from the input layer 15A. Each convolutional layer in the intermediate layer 15B applies filter processing to nearby nodes in the image set Sa or the preceding layer (performs a convolution operation using a filter) to obtain a "feature map". Each pooling layer reduces (or enlarges) the feature map output from the convolutional layer to produce a new feature map. The "convolutional layers" play the role of feature extraction, such as edge extraction from the image, and the "pooling layers" provide robustness so that the extracted features are not affected by translation and the like. The intermediate layer 15B is not limited to alternating convolutional and pooling layers; there may be consecutive convolutional layers, and normalization layers may also be included.
 The output layer 15C is the part that outputs the recognition result, detecting the position of the region of interest appearing in the endoscopic image and classifying (discriminating) the type of lesion based on the features extracted by the intermediate layer 15B.
 The CNN 15 has been trained with a large number of sets of learning image sets Sa and correct-answer data for those image sets, and the filter coefficients and offset values applied to each convolutional layer of the CNN 15 are set to optimal values by the learning data sets. Here, the correct-answer data is preferably a region of interest or discrimination result specified by a doctor for the endoscopic image (in this example, at least one image of the image set Sa).
 FIG. 5 is a schematic diagram showing a configuration example of the intermediate layer 15B of the CNN 15 shown in FIG. 4.
 In the first (1st) convolutional layer, a convolution operation is performed between the recognition image set Sa and a filter F1. Here, the image set Sa consists of N images (N channels) with an image size of H vertically and W horizontally. In this example, as shown in FIG. 3, the image set Sa is a 9-channel image.
 Since the image set Sa has N channels (N images), the filter F1 convolved with it is, for example in the case of a size-5 filter, a 5 × 5 × N filter.
 The convolution operation using this filter F1 generates a one-channel (single) "feature map" per filter F1. In the example shown in FIG. 5, using M filters F1 generates an M-channel "feature map".
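The filter and feature-map shapes described above can be checked numerically; a short PyTorch sketch, taking N = 9 input channels and M = 64 size-5 filters as illustrative values:

```python
import torch
import torch.nn as nn

conv1 = nn.Conv2d(in_channels=9, out_channels=64, kernel_size=5, padding=2)
print(conv1.weight.shape)         # torch.Size([64, 9, 5, 5]): M filters, each 5 x 5 x N
x = torch.randn(1, 9, 240, 320)   # one image set Sa with H = 240, W = 320
print(conv1(x).shape)             # torch.Size([1, 64, 240, 320]): M-channel feature map
```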
 The filter F2 used in the second convolutional layer is, for example in the case of a size-3 filter, a 3 × 3 × M filter.
 The size of the "feature map" in the n-th convolutional layer is smaller than the size of the "feature map" in the second convolutional layer because it has been downscaled by the preceding convolutional layers.
 The convolutional layers in the first half of the intermediate layer 15B are responsible for feature extraction, and the convolutional layers in the second half are responsible for segmentation of the object (region of interest). In the second half, the maps are upscaled, and in the last convolutional layer a single "feature map" of the same size as the input image set Sa is obtained. From this "feature map" obtained from the intermediate layer 15B, the output layer 15C of the CNN 15 (FIG. 4) grasps the position of the region of interest appearing in the images of the image set Sa at the pixel level. That is, it can detect, for each pixel of the endoscopic image, whether the pixel belongs to the region of interest, and output the detection result.
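Putting the pieces together, a toy fully convolutional sketch with a 9-channel input and a per-pixel region-of-interest score map as output (layer counts and widths are arbitrary; the specification does not fix a concrete architecture):

```python
import torch
import torch.nn as nn

class ToyFCN(nn.Module):
    """Toy FCN: a 9-channel image set in, a 1-channel per-pixel
    region-of-interest score map of the same H x W out."""

    def __init__(self, in_channels=9):
        super().__init__()
        # First half: feature extraction with downscaling.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                      # downscale by 2
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Second half: upscaling back to the input resolution.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=1),      # per-pixel score
        )

    def forward(self, x):                         # x: (B, 9, H, W), H and W even
        return torch.sigmoid(self.decoder(self.encoder(x)))
```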
 According to this embodiment, since recognition uses the plurality of images sequentially acquired in the multi-frame shooting mode (the image set of WL, BLI, and LCI images), recognition accuracy can be improved compared with recognition using only one (one kind) of the WL, BLI, and LCI images.
 Although the CNN 15 of this example recognizes the position of the region of interest appearing in the endoscopic image, the recognizer (CNN) according to the present invention is not limited to this and may perform discrimination regarding a lesion and output a discrimination result. For example, the recognizer may classify the endoscopic image into three categories, "neoplastic", "non-neoplastic", and "other", and output as the discrimination result three scores corresponding to these categories (the three scores summing to 100%), or output a classification result when the three scores allow a clear classification. In the case of a CNN that outputs such a discrimination result, it is preferable to have one or more fully connected layers as the last layer(s) of the intermediate layer instead of a fully convolutional network (FCN).
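For this discrimination variant, the tail of the network would be a fully connected head whose three scores sum to 100%; a minimal sketch under the same illustrative assumptions:

```python
import torch
import torch.nn as nn

class DiscriminationHead(nn.Module):
    """Toy head mapping pooled CNN features to three scores
    ('neoplastic', 'non-neoplastic', 'other') summing to 100%."""

    def __init__(self, feature_channels=64, num_classes=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)                  # global average pool
        self.fc = nn.Linear(feature_channels, num_classes)   # fully connected layer

    def forward(self, feature_map):                 # (B, C, H, W)
        pooled = self.pool(feature_map).flatten(1)  # (B, C)
        return torch.softmax(self.fc(pooled), dim=1) * 100.0
```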
 [Operation of endoscope system]
 FIG. 6 is a block diagram showing the main configuration used to explain the operation of the endoscope system 10 according to the present invention.
 光源ユニット32のV-LED32a、B-LED32b、G-LED32c、及びR-LED32dからは、それぞれ異なるピーク波長をもつ観察光(V光、B光、G光、及びR光)が、ライトガイド40を介して被検体20に照射される。V光、B光、G光、及びR光は、それぞれ被検体20の深度の異なる複数の層に到達するため、これらの観察光により被検体20の深度の異なる画像の撮像が可能である。 From the V-LED 32a, B-LED 32b, G-LED 32c, and R-LED 32d of the light source unit 32, observation light (V light, B light, G light, and R light) having different peak wavelengths are respectively transmitted to the light guide 40. The subject 20 is irradiated via the. Since the V light, the B light, the G light, and the R light respectively reach a plurality of layers at different depths of the subject 20, images of the subject 20 at different depths can be captured by these observation lights.
 As described with reference to FIG. 3, in the multi-frame shooting mode, a WL image, a BLI image, and an LCI image are sequentially acquired using a plurality of different observation lights (for example, first observation light for WL, second observation light for BLI, and third observation light for LCI); as noted above, the observation lights for WL, BLI, and LCI differ in the light quantity ratios of the V light, B light, G light, and R light.
 In the endoscope scope 11, the WL image, the BLI image, and the LCI image are sequentially and repeatedly captured under irradiation with the plurality of different observation lights. Since the WL image, the BLI image, and the LCI image are each color images, the endoscope processor 13 generates the WL image, the BLI image, and the LCI image as three-channel RGB images.
 The recognizer 15 receives an image set Sa consisting of the WL image, the BLI image, and the LCI image (nine channels of images in total) as the images for recognition.
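 For illustration, stacking the three RGB frames into a single nine-channel input might look like the following sketch; the array names are hypothetical:

```python
import numpy as np

# Three RGB frames captured under different observation lights,
# each of shape (H, W, 3); the names are hypothetical placeholders.
wl_image = np.zeros((256, 256, 3), dtype=np.float32)
bli_image = np.zeros((256, 256, 3), dtype=np.float32)
lci_image = np.zeros((256, 256, 3), dtype=np.float32)

# Concatenate along the channel axis to form the (H, W, 9) image set Sa.
image_set_sa = np.concatenate([wl_image, bli_image, lci_image], axis=-1)
assert image_set_sa.shape == (256, 256, 9)
```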
 The recognizer 15 detects the position of the region of interest (in this example, a lesion region) in the endoscope image and outputs position information (the recognition result) indicating the lesion region to the endoscope processor 13.
 The image processing unit 65 of the endoscope processor 13 generates the WL image, the BLI image, and the LCI image from the image signal input from the endoscope scope 11, and also generates an observation image. The observation image may be a part of the plurality of images (for example, the WL image among the WL image, the BLI image, and the LCI image), or an image calculated using the plurality of images (an image obtained by combining two or more of the WL image, the BLI image, and the LCI image). If a plurality of images sequentially acquired with different observation lights were displayed in sequence as observation images without modification, their appearance would change and flicker; the observation image is therefore preferably a single type of image.
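 One simple way to realize the "image calculated using the plurality of images" mentioned above is a weighted blend; a minimal sketch follows, where the weights are illustrative assumptions rather than values specified here:

```python
import numpy as np

def make_observation_image(wl, bli, lci, weights=(0.6, 0.2, 0.2)):
    """Blend two or more frames into one stable observation image.
    The weights are illustrative assumptions, not values from the patent."""
    w_wl, w_bli, w_lci = weights
    blended = w_wl * wl.astype(np.float32) \
            + w_bli * bli.astype(np.float32) \
            + w_lci * lci.astype(np.float32)
    return np.clip(blended, 0.0, 255.0).astype(np.uint8)
```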
 The display control unit 66 receives the observation image from the image processing unit 65 and the position information indicating the lesion region from the recognizer 15, and causes the display 14 to display the observation image and the recognition result.
 In this example, the display control unit 66 displays the observation image 26 on the display 14 and applies emphasis processing that emphasizes the recognized region of interest (lesion region). In the emphasis processing by the display control unit 66, an index 28 indicating the lesion region is superimposed on the observation image 26 displayed on the display 14, thereby highlighting the lesion region. Here, the index 28 may be displayed as a highlight such as a change in the color of the lesion region, a boundary line showing the outline of the lesion region, a marker indicating the lesion region, or a bounding box.
 By superimposing the index 28 indicating the region of interest on the observation image 26 displayed on the display 14 in this way, the examination can be supported so that the region of interest is not overlooked.
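 A bounding-box overlay of this kind can be sketched with OpenCV as follows; the box coordinates stand in for the recognizer's output and are hypothetical:

```python
import cv2

def draw_lesion_index(observation_image, bbox, color=(0, 255, 0)):
    """Superimpose a bounding box (the index 28) on the observation image.
    bbox = (x, y, width, height) as output by a position-detecting recognizer."""
    x, y, w, h = bbox
    overlaid = observation_image.copy()
    cv2.rectangle(overlaid, (x, y), (x + w, y + h), color, thickness=2)
    return overlaid

# Hypothetical usage: frame is a BGR observation image, box from the recognizer.
# highlighted = draw_lesion_index(frame, (120, 80, 60, 40))
```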
 The recognizer 15 of this example recognizes the position of the region of interest in the endoscope image, but it is not limited to this and may instead perform discrimination of a lesion and output a discrimination result. One way to display the discrimination result is, for example, to display text representing it on the image on the display 14. The display position of the text is not particularly limited: it need not be on the image, and may be in a window separate from the image as long as the correspondence with the image is clear.
 [Another embodiment of multi-frame shooting]
 When a color endoscope image is acquired with an endoscope scope that has a monochrome image sensor without color filters instead of the image sensor 45 (a color image sensor), the subject is sequentially illuminated with observation lights of different colors and an image is captured for each observation light (frame-sequential imaging).
 For example, by sequentially emitting observation lights of different colors (R light, G light, B light, and V light) from the light source unit 32, an R image, a G image, a B image, and a V image corresponding to the R light, G light, B light, and V light are captured frame-sequentially with the monochrome image sensor.
 FIG. 7 is a diagram showing an example of an R image, a G image, a B image, and a V image captured frame-sequentially, and of an image set.
 The endoscope processor 13 can generate observation images such as the WL image, the BLI image, and the LCI image from the plurality of images (the R image, G image, B image, and V image) sequentially acquired using the plurality of different observation lights (R light, G light, B light, and V light). These observation images can be generated by adjusting the synthesis ratios of the R, G, B, and V images.
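 For instance, synthesizing a color image from the four monochrome frames can be sketched as a per-pixel mixing matrix. The coefficients below are placeholder assumptions, since the actual ratios for WL, BLI, and LCI are not given here:

```python
import numpy as np

def synthesize_color_image(r, g, b, v, mix):
    """Combine four monochrome frames (H, W) into one RGB image (H, W, 3).
    'mix' is a 3x4 matrix of synthesis ratios: rows = output R, G, B channels;
    columns = contributions from the R, G, B, V frames."""
    stack = np.stack([r, g, b, v], axis=-1)    # (H, W, 4)
    color = stack @ np.asarray(mix).T          # (H, W, 3)
    return np.clip(color, 0.0, 255.0).astype(np.uint8)

# Placeholder ratios emphasizing short wavelengths, e.g. for a BLI-like image.
bli_mix = [[0.9, 0.1, 0.0, 0.0],
           [0.0, 0.8, 0.1, 0.1],
           [0.0, 0.0, 0.5, 0.5]]
```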
 An image obtained by multiplying at least two of the R image, G image, B image, and V image by preset coefficients and combining them (by the four arithmetic operations) may also be included in the image set. For example, an image obtained by dividing each pixel of the image with a center wavelength of 410 nm (the V image) by the corresponding pixel of the image with a center wavelength of 450 nm (the B image), or an image obtained by multiplying each pixel of the V image (410 nm) by the corresponding pixel of the B image (450 nm), may be used.
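 A pixel-wise ratio image of this kind can be sketched as follows; the small epsilon guarding against division by zero is an implementation assumption:

```python
import numpy as np

def ratio_image(v_image, b_image, coeff=1.0, eps=1e-6):
    """Per-pixel division of the 410 nm (V) image by the 450 nm (B) image.
    'coeff' is the preset coefficient; 'eps' avoids division by zero."""
    v = v_image.astype(np.float32)
    b = b_image.astype(np.float32)
    return coeff * v / (b + eps)
```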
 The recognizer 15 can receive the WL image, the BLI image, and the LCI image generated by the endoscope processor 13 as an image set Sb, and return a recognition result for the endoscope image to the endoscope processor 13.
 The recognizer 15 of this example receives the image set Sa consisting of the WL image, the BLI image, and the LCI image (nine channels of images in total) as the images for recognition, but it is not limited to this; for example, it may receive an image set consisting of the above R image, G image, B image, and V image and output a recognition result for the endoscope image.
 [Image processing method]
 FIG. 8 is a flowchart showing an embodiment of the image processing method according to the present invention, illustrating the processing procedure of each unit of the endoscope system 10 shown in FIG. 2.
 In FIG. 8, the multi-frame shooting mode is set, and the endoscope scope 11 sequentially captures multi-frame images using a plurality of different observation lights (step S10).
 The endoscope processor 13 acquires the image set constituting the multi-frame images captured by the endoscope scope 11 (step S12, first step).
 The image set may be the WL image, BLI image, and LCI image captured by the endoscope scope 11 with the observation lights for WL, BLI, and LCI, or the WL image, BLI image, and LCI image generated from the R image, G image, B image, and V image captured frame-sequentially. The special-light image may be only one of the BLI image and the LCI image, or may be a special-light image captured with other special light. The image set may also contain no WL image (normal-light image) but instead two or more special-light images, including a first special-light image captured with first special light and a second special-light image captured with second special light. In short, any image set will do, as long as it consists of a plurality of images sequentially acquired using a plurality of different observation lights.
 The image processing unit 65 of the endoscope processor 13 generates the observation image based on the acquired image set (step S14). The observation image is a part of the plurality of images (for example, the WL image among the WL image, the BLI image, and the LCI image) or an image calculated using the plurality of images.
 Meanwhile, based on the image set received via the endoscope processor 13, the recognizer 15 detects the position of the region of interest in the endoscope image, discriminates the type of lesion, and so on, and outputs the recognition result (step S16, second step).
 The display control unit 66 then causes the display 14 to display the generated observation image and the recognition result from the recognizer 15 (step S18, third step).
 Subsequently, it is determined whether to end the capture of multi-frame images (step S20). When the capture of multi-frame images continues ("No"), the process returns to step S10, and the processing from step S10 to step S20 is repeated. As a result, the observation image is displayed as a moving image, and the recognition result of the recognizer 15 is also displayed continuously.
 When the capture of multi-frame images is to end ("Yes"), the present processing is terminated.
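 As a hedged sketch, the loop of FIG. 8 might be expressed as follows; every function name is a hypothetical stand-in for the units described above, not an API defined by this disclosure:

```python
def run_multiframe_inspection(scope, processor, recognizer, display):
    """Sketch of the FIG. 8 loop; all callables are hypothetical stand-ins."""
    while not scope.capture_finished():              # step S20
        image_set = scope.capture_multiframe()       # step S10
        image_set = processor.acquire(image_set)     # step S12 (first step)
        observation = processor.make_observation_image(image_set)  # step S14
        result = recognizer.recognize(image_set)     # step S16 (second step)
        display.show(observation, result)            # step S18 (third step)
```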
 [Others]
 In the present embodiment, the endoscope system 10 including the endoscope scope 11 and the like has been described, but the present invention is not limited to the endoscope system 10 and may be an image processing apparatus consisting of the endoscope processor 13 and the recognizer 15. In this case, the endoscope processor 13 and the recognizer 15 may be integrated or separate.
 The different observation lights are not limited to those emitted from the four-color LEDs. For example, the light sources may be a blue laser diode that emits blue laser light with a center wavelength of 445 nm and a blue-violet laser diode that emits blue-violet laser light with a center wavelength of 405 nm, with the laser light from these diodes irradiating a YAG (Yttrium Aluminum Garnet) phosphor to produce the emission. When the phosphor is irradiated with the blue laser light, it is excited and emits broadband fluorescence, while part of the blue laser light passes through the phosphor unchanged. The blue-violet laser light passes through without exciting the phosphor. Therefore, by adjusting the intensities of the blue laser light and the blue-violet laser light, the observation light for WL, the observation light for BLI, and the observation light for LCI can be emitted; and when only the blue-violet laser light is emitted, observation light with a center wavelength of 405 nm can be emitted.
 The observation image according to the present invention is not limited to a moving image and may be a still image stored in the storage unit 67 or the like, and the recognizer may output a recognition result based on an image set of still images.
 Furthermore, the recognizer is not limited to a CNN and may be a machine learning model other than a CNN, such as a DBN (Deep Belief Network) or an SVM (Support Vector Machine).
 The hardware structure of the endoscope processor 13 and/or the recognizer 15 is realized by various processors as follows. The various processors include a CPU (Central Processing Unit), which is a general-purpose processor that executes software (programs) to function as various control units; a programmable logic device (PLD) such as an FPGA (Field Programmable Gate Array), whose circuit configuration can be changed after manufacture; and a dedicated electric circuit such as an ASIC (Application Specific Integrated Circuit), which is a processor having a circuit configuration designed specifically to execute a specific process.
 One processing unit may be configured by one of these various processors, or by two or more processors of the same or different types (for example, a plurality of FPGAs, or a combination of a CPU and an FPGA). A plurality of control units may also be configured by one processor. As a first example of configuring a plurality of control units with one processor, one processor may be configured by a combination of one or more CPUs and software, as typified by computers such as clients and servers, and this processor may function as the plurality of control units. As a second example, as typified by a system on chip (SoC), a processor may be used that realizes the functions of the entire system including the plurality of control units with a single IC (Integrated Circuit) chip. In this way, the various control units are configured, as a hardware structure, using one or more of the various processors described above.
 Furthermore, the present invention is not limited to the embodiment described above, and it goes without saying that various modifications are possible without departing from the spirit of the present invention.
Reference Signs List
10 endoscope system
11 endoscope scope
12 light source device
12a light source operation unit
13 endoscope processor
13a processor operation unit
14 display
15 recognizer (CNN)
15A input layer
15B intermediate layer
15C output layer
16 insertion section
16a insertion section distal end
16b bending section
16c flexible tube section
17 handheld operation section
18 universal cord
20 subject
21 angle knob
22 operation button
23 forceps inlet
25a connector section
25b connector section
26 observation image
28 index
31 light source control unit
32 light source unit
32a V-LED
32b B-LED
32c G-LED
32d R-LED
40 light guide
42 illumination lens
44 objective lens
45 image sensor
46 endoscope operation unit
47 endoscope control unit
48, 62 ROM
61 processor control unit
65 image processing unit
66 display control unit
67 storage unit
F1 filter
S image set
S10 to S20 steps

Claims (18)

  1.  An image processing apparatus comprising: a recognizer that receives an image set consisting of a plurality of images sequentially acquired using a plurality of different observation lights and outputs a recognition result for the image set; and a display control unit that causes a display unit to display the recognition result together with an observation image that is a part of the plurality of images or an image calculated using the plurality of images.
  2.  The image processing apparatus according to claim 1, wherein the recognizer has a trained model trained on sets of the plurality of images for learning and correct-answer data, and outputs the recognition result based on the trained model each time it receives the plurality of images for recognition.
  3.  The image processing apparatus according to claim 2, wherein the trained model is configured by a convolutional neural network.
  4.  The image processing apparatus according to any one of claims 1 to 3, wherein the plurality of images include a first endoscope image and a second endoscope image acquired using observation light different from that of the first endoscope image.
  5.  The image processing apparatus according to claim 4, wherein the first endoscope image is a normal-light image captured with normal light, and the second endoscope image is a special-light image captured with special light.
  6.  The image processing apparatus according to claim 5, wherein the special-light image includes two or more special-light images captured with two or more different special lights.
  7.  The image processing apparatus according to claim 4, wherein the first endoscope image is a first special-light image captured with first special light, and the second endoscope image is a second special-light image captured with second special light different from the first special light.
  8.  The image processing apparatus according to any one of claims 1 to 6, wherein the display control unit causes the display unit to display, as a moving image, the observation image that is a part of the plurality of images or an image calculated using the plurality of images.
  9.  The image processing apparatus according to any one of claims 1 to 8, wherein the recognizer recognizes a region of interest included in the plurality of images, and the display control unit displays an index indicating the recognized region of interest superimposed on the image displayed on the display unit.
  10.  The image processing apparatus according to any one of claims 1 to 8, wherein the recognizer recognizes a region of interest included in the plurality of images, and the display control unit displays information indicating the presence or absence of the region of interest so as not to overlap the image displayed on the display unit.
  11.  The image processing apparatus according to any one of claims 1 to 10, wherein the recognizer performs discrimination of a lesion based on the plurality of images and outputs a discrimination result, and the display control unit causes the display unit to display the discrimination result.
  12.  An endoscope system comprising: a light source device that sequentially generates first observation light and second observation light different from the first observation light; an endoscope scope that captures the plurality of images by sequentially imaging an observation target sequentially illuminated by the first observation light and the second observation light; the display unit; and the image processing apparatus according to any one of claims 1 to 11, wherein the recognizer receives the image set consisting of the plurality of images captured by the endoscope scope.
  13.  The endoscope system according to claim 12, further comprising an endoscope processor that receives the plurality of images captured by the endoscope scope and performs image processing on the plurality of images, wherein the recognizer receives the plurality of images after the image processing by the endoscope processor.
  14.  An image processing method comprising: a first step of receiving an image set consisting of a plurality of images acquired using a plurality of different observation lights; a second step in which a recognizer outputs a recognition result for the image set; and a third step in which a display control unit causes a display unit to display the recognition result together with an observation image that is a part of the plurality of images or an image calculated using the plurality of images, wherein the processing from the first step to the third step is repeatedly executed.
  15.  The image processing method according to claim 14, wherein in the second step, the recognizer, which has a trained model trained on the image set for learning and correct-answer data, outputs the recognition result based on the trained model each time it receives the image set for recognition.
  16.  The image processing method according to claim 15, wherein the trained model is configured by a convolutional neural network.
  17.  The image processing method according to any one of claims 14 to 16, wherein the plurality of images include a first endoscope image and a second endoscope image acquired using observation light different from that of the first endoscope image.
  18.  The image processing method according to claim 17, wherein the first endoscope image is a normal-light image captured with normal light, and the second endoscope image is a special-light image captured with special light.
PCT/JP2019/023492 2018-07-05 2019-06-13 Image processing device, method, and endoscopic system WO2020008834A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2020528760A JP7289296B2 (en) 2018-07-05 2019-06-13 Image processing device, endoscope system, and method of operating image processing device
JP2022168121A JP2022189900A (en) 2018-07-05 2022-10-20 Image processing device, endoscope system, and operation method of image processing device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2018-128168 2018-07-05
JP2018128168 2018-07-05

Publications (1)

Publication Number Publication Date
WO2020008834A1 true WO2020008834A1 (en) 2020-01-09

Family

ID=69059546

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/023492 WO2020008834A1 (en) 2018-07-05 2019-06-13 Image processing device, method, and endoscopic system

Country Status (2)

Country Link
JP (2) JP7289296B2 (en)
WO (1) WO2020008834A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021125056A (en) * 2020-02-07 2021-08-30 カシオ計算機株式会社 Identification device, identification equipment learning method, identification method, and program
JP2022038390A (en) * 2020-08-26 2022-03-10 株式会社東芝 Inference device, method, program, and learning device
WO2023281607A1 (en) * 2021-07-05 2023-01-12 オリンパスメディカルシステムズ株式会社 Endoscope processor, endoscope device, and method of generating diagnostic image
WO2023007896A1 (en) * 2021-07-28 2023-02-02 富士フイルム株式会社 Endoscope system, processor device, and operation method therefor
WO2023026538A1 (en) * 2021-08-27 2023-03-02 ソニーグループ株式会社 Medical assistance system, medical assistance method, and evaluation assistance device
JP7411515B2 (en) 2020-07-16 2024-01-11 富士フイルム株式会社 Endoscope system and its operating method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017175282A1 (en) * 2016-04-04 2017-10-12 オリンパス株式会社 Learning method, image recognition device, and program
WO2019088121A1 (en) * 2017-10-30 2019-05-09 公益財団法人がん研究会 Image diagnosis assistance apparatus, data collection method, image diagnosis assistance method, and image diagnosis assistance program

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5308815B2 (en) * 2006-04-20 2013-10-09 オリンパスメディカルシステムズ株式会社 Biological observation system
JP6140056B2 (en) * 2013-09-26 2017-05-31 富士フイルム株式会社 Endoscope system, processor device for endoscope system, method for operating endoscope system, method for operating processor device
JP6602969B2 (en) * 2016-05-23 2019-11-06 オリンパス株式会社 Endoscopic image processing device
US10803582B2 (en) * 2016-07-04 2020-10-13 Nec Corporation Image diagnosis learning device, image diagnosis device, image diagnosis method, and recording medium for storing program
CN110049709B (en) * 2016-12-07 2022-01-11 奥林巴斯株式会社 Image processing apparatus

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017175282A1 (en) * 2016-04-04 2017-10-12 オリンパス株式会社 Learning method, image recognition device, and program
WO2019088121A1 (en) * 2017-10-30 2019-05-09 公益財団法人がん研究会 Image diagnosis assistance apparatus, data collection method, image diagnosis assistance method, and image diagnosis assistance program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
IWAHORI, YUJI ET AL.: "Classification and size & shape recovery from endoscope image for supporting medical diagnosis", SOGO KOGAKU, vol. 30, 31 March 2018 (2018-03-31), pages 18-36, ISSN: 0915-3292 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021125056A (en) * 2020-02-07 2021-08-30 カシオ計算機株式会社 Identification device, identification equipment learning method, identification method, and program
JP7044120B2 (en) 2020-02-07 2022-03-30 カシオ計算機株式会社 Discriminator, discriminator learning method, discriminating method and program
US11295443B2 (en) 2020-02-07 2022-04-05 Casio Computer Co., Ltd. Identification apparatus, identifier training method, identification method, and recording medium
JP7411515B2 (en) 2020-07-16 2024-01-11 富士フイルム株式会社 Endoscope system and its operating method
JP2022038390A (en) * 2020-08-26 2022-03-10 株式会社東芝 Inference device, method, program, and learning device
WO2023281607A1 (en) * 2021-07-05 2023-01-12 オリンパスメディカルシステムズ株式会社 Endoscope processor, endoscope device, and method of generating diagnostic image
WO2023007896A1 (en) * 2021-07-28 2023-02-02 富士フイルム株式会社 Endoscope system, processor device, and operation method therefor
WO2023026538A1 (en) * 2021-08-27 2023-03-02 ソニーグループ株式会社 Medical assistance system, medical assistance method, and evaluation assistance device

Also Published As

Publication number Publication date
JP2022189900A (en) 2022-12-22
JPWO2020008834A1 (en) 2021-06-24
JP7289296B2 (en) 2023-06-09

Similar Documents

Publication Publication Date Title
WO2020008834A1 (en) Image processing device, method, and endoscopic system
JP7346285B2 (en) Medical image processing device, endoscope system, operating method and program for medical image processing device
JP7135082B2 (en) Endoscope device, method of operating endoscope device, and program
JPWO2018159363A1 (en) Endoscope system and operation method thereof
US20210343011A1 (en) Medical image processing apparatus, endoscope system, and medical image processing method
US11948080B2 (en) Image processing method and image processing apparatus
JP7015385B2 (en) Endoscopic image processing device, operation method of endoscopic device, and program
JP7335399B2 (en) MEDICAL IMAGE PROCESSING APPARATUS, ENDOSCOPE SYSTEM, AND METHOD OF OPERATION OF MEDICAL IMAGE PROCESSING APPARATUS
JP7374280B2 (en) Endoscope device, endoscope processor, and method of operating the endoscope device
JP2021086350A (en) Image learning device, image learning method, neural network, and image classification device
WO2020170809A1 (en) Medical image processing device, endoscope system, and medical image processing method
JP7387859B2 (en) Medical image processing device, processor device, endoscope system, operating method and program for medical image processing device
WO2021199910A1 (en) Medical image processing system and method for operating medical image processing system
US20230389774A1 (en) Medical image processing apparatus, endoscope system, medical image processing method, and medical image processing program
WO2023007896A1 (en) Endoscope system, processor device, and operation method therefor
WO2021153471A1 (en) Medical image processing device, medical image processing method, and program
US20240013392A1 (en) Processor device, medical image processing device, medical image processing system, and endoscope system
WO2019202982A1 (en) Endoscope device, endoscope operating method, and program
CN114627045A (en) Medical image processing system and method for operating medical image processing system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19830637

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2020528760

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19830637

Country of ref document: EP

Kind code of ref document: A1