WO2020075719A1

WO2020075719A1 - Image processing device, image processing method, and program

Info

Publication number: WO2020075719A1
Application number: PCT/JP2019/039676
Authority: WO
Inventors: 櫛田　晃弘; 律也富田
Original assignee: キヤノン株式会社
Priority date: 2018-10-10
Filing date: 2019-10-08
Publication date: 2020-04-16

Abstract

Provided is an image processing device provided with: an acquisition unit that acquires a first tomographic image of a subject; and an image quality improving unit that generates, from the first tomographic image by use of a learned model, a second tomographic image such that different image processes have been performed on different areas of the first tomographic image.

Description

Image processing apparatus, image processing method, and program

The present invention relates to an image processing device, an image processing method, and a program.

Currently, various types of ophthalmic equipment using optical equipment are used. For example, various devices such as an anterior segment imaging device, a fundus camera, and a confocal laser scanning ophthalmoscope (SLO: Scanning Laser Ophthalmoscope) are used as optical devices for observing an eye.

Above all, an optical coherence tomography apparatus (OCT apparatus) using optical coherence tomography (OCT) utilizing multi-wavelength light wave interference is an apparatus capable of obtaining a tomographic image of a sample with high resolution. For this reason, the OCT device is becoming an indispensable device in the outpatient specialized in the retina as an ophthalmic device. Further, the OCT device is used not only for ophthalmology but also for endoscopes and the like. The OCT apparatus is widely used in ophthalmologic diagnosis and the like to acquire a tomographic image of the retina of the fundus of the eye to be inspected and an anterior ocular segment such as the cornea.

Original data of a tomographic image captured by an OCT apparatus is generally in a floating point format of about 32 bits or an integer format of 10 bits or more, and has a high dynamic range data including very low brightness information to high brightness information. Is. On the other hand, data that can be displayed on a normal display is, for example, 8-bit integer format data, which is data having a relatively low dynamic range. Therefore, if the original data having a high dynamic range is directly converted into the data having a low dynamic range for display, the contrast of the retina, which is important for the diagnosis of the fundus, is significantly reduced.

Therefore, in a general OCT device, when converting the original data of the tomographic image into the data for display, the low-luminance side data is discarded to some extent to obtain good contrast in the retina. In this case, in the displayed tomographic image, the contrast of the region related to the vitreous body, choroid, etc., which is shown as a low-luminance region, decreases, and it becomes difficult to observe the internal structure of the vitreous and choroid.

On the other hand, in order to observe the internal structure of the vitreous part and choroid part in more detail, if the original data of the tomographic image is gradation-converted so as to ensure the contrast of the regions related to the vitreous part and the choroid part, high brightness The contrast of the retina area is reduced, making it difficult to observe the retina area.

In recent years, there is a need to make a global observation as well as a local observation of the eye to be inspected. Regarding such needs, Patent Document 1 proposes a method of segmenting a tomographic image, setting display conditions for each specified partial region, and performing gradation conversion processing.

International Publication No. 2014/203901

In diseased eyes, the shape of the retina becomes irregular due to the disappearance of layers, bleeding, and the formation of white spots and new blood vessels. Therefore, in the conventional segmentation processing method that determines the result of image feature extraction by utilizing the regularity of the shape of the retina and detects the boundary of the retinal layer, erroneous detection is performed when the boundary detection of the retinal layer is performed automatically. There was a limit that such things occur. In this case, due to erroneous detection or the like in the segmentation process, it may not be possible to appropriately perform the gradation conversion process or the like for each partial region (observation target) for performing a global observation of the eye to be inspected.

Therefore, one of the objects of the present invention is to provide an image processing apparatus, an image processing method, and a program that can generate an image in which appropriate image processing has been performed for each region to be observed.

An image processing apparatus according to an embodiment of the present invention uses an acquisition unit that acquires a first medical image of a subject and a learned model to convert the first medical image from the first medical image into the first medical image. An image quality improving unit that generates a second medical image in which different regions are subjected to different image processing.

An image processing method according to another embodiment of the present invention uses a step of acquiring a first medical image of a subject and a trained model to convert the first medical image from the first medical image into the first medical image. Generating a second medical image in which different regions are subjected to different image processing.

Further features of the invention will be apparent from the following description of an exemplary embodiment with reference to the accompanying drawings.

1 shows a schematic configuration example of an OCT apparatus according to a first embodiment. 1 illustrates a schematic configuration example of an imaging unit according to a first embodiment. 1 shows a schematic configuration example of a control unit according to the first embodiment. It is explanatory drawing of the segmentation of a retina part, a vitreous part, and a choroid part. It is an explanatory view of general display image processing. It is an explanatory view of general display image processing. It is an explanatory view of general display image processing. It is an explanatory view of general display image processing. It is explanatory drawing of the conversion process which makes it easy to observe a retina part. It is explanatory drawing of the conversion process which makes it easy to observe a retina part. It is explanatory drawing of the conversion process which makes it easy to observe a vitreous part and a choroid part. It is explanatory drawing of the conversion process which makes it easy to observe a vitreous part and a choroid part. An example of learning data is shown. An example of learning data is shown. An example of learning data is shown. An example of learning data is shown. The structural example of the learned model is shown. 6 is a flowchart of a series of image processing according to the first embodiment. Another example of learning data is shown. Another example of learning data is shown. Another example of learning data is shown. It is explanatory drawing of imaging | photography in vitreous body mode. It is explanatory drawing of imaging | photography in vitreous body mode. It is explanatory drawing of imaging | photography in vitreous body mode. It is explanatory drawing of imaging | photography in choroid mode. It is explanatory drawing of imaging | photography in choroid mode. It is explanatory drawing of imaging | photography in choroid mode. Another example of learning data is shown. Another example of learning data is shown. Another example of learning data is shown. 3 shows a schematic configuration example of a control unit according to a second embodiment. 9 is a flowchart of a series of image processing according to the second embodiment. An example of a display screen for selecting an area to be noted is shown. An example of a display screen for selecting an area to be noted is shown. An example of a display screen for selecting an area to be noted is shown. The schematic structural example of the control part which concerns on Example 3 is shown. 9 is a flowchart of a series of image processing according to the third embodiment. An example of En-Face images of a plurality of OCTAs is shown. An example of a plurality of tomographic images is shown. 14 shows an example of a user interface according to a fourth embodiment. 14 shows an example of a user interface according to a fourth embodiment. 14 shows an example of a user interface according to a fourth embodiment. An example of the configuration of a neural network used as a machine learning model according to Modification 13 is shown. An example of the configuration of a neural network used as a machine learning model according to Modification 13 is shown. An example of the configuration of a neural network used as a machine learning model according to Modification 13 is shown. An example of the configuration of a neural network used as a machine learning model according to Modification 13 is shown.

Hereinafter, exemplary embodiments for carrying out the present invention will be described in detail with reference to the drawings. However, dimensions, materials, shapes, relative positions of components and the like described in the following embodiments are arbitrary, and can be changed according to the configuration of the apparatus to which the present invention is applied or various conditions. Also, in the drawings, the same reference numerals are used between the drawings to indicate the same or functionally similar elements.

Note that in the following, the machine learning model means a learning model based on a machine learning algorithm. Specific algorithms for machine learning include nearest neighbor method, naive Bayes method, decision tree, and support vector machine. Further, there is also a deep learning in which a feature amount for learning and a connection weighting coefficient are generated by themselves using a neural network. Appropriately applicable ones of the above algorithms can be applied to the following embodiments and modifications. Further, the teacher data refers to learning data, and is composed of a pair of input data and output data. The correct answer data is output data of learning data (teacher data).

Note that the learned model is a model obtained by training (learning) a machine learning model according to an arbitrary machine learning algorithm such as deep learning in advance using appropriate teacher data (learning data). . However, although the learned model has been obtained in advance by using appropriate learning data, it is assumed that additional learning can be performed instead of not performing further learning. The additional learning can also be performed after the device has been installed at the point of use.

(Example 1)
An OCT apparatus according to the first embodiment will be described with reference to FIGS. 1 to 13C. FIG. 1 shows a schematic configuration example of the OCT apparatus according to this embodiment.

(Body configuration)
The OCT device 1 is provided with an imaging unit 20, a control unit 30 (image processing device), an input unit 40, and a display unit 50. The imaging unit 20 is provided with a measurement optical system 21, a stage unit 25, and a base unit 23. The measurement optical system 21 can acquire an anterior segment image, an SLO fundus image of the subject's eye, and a tomographic image. The measurement optical system 21 is provided on the base portion 23 via the stage portion 25. The stage unit 25 supports the measurement optical system 21 so as to be movable back and forth and left and right. The base unit 23 is provided with a spectroscope described later.

The control unit 30 is connected to the photographing unit 20 and the display unit 50 and can control them. The control unit 30 can also generate a tomographic image and perform image processing based on the tomographic information acquired from the imaging unit 20 and the like. The control unit 30 may be connected to any other device (not shown) via any network such as the Internet.

An input unit 40 is connected to the control unit 30. The input unit 40 is operated by an operator (inspector) and is used to input an instruction to the control unit 30. The input unit 40 may include any input means, and may include, for example, a keyboard and a mouse. The display unit 50 is configured by an arbitrary display, and can display the information of the subject, various images, and the like under the control of the control unit 30.

(Structure of shooting unit)
Next, the configuration of the image capturing unit 20 will be described with reference to FIG. FIG. 2 shows a schematic configuration example of the imaging unit 20 according to the present embodiment.

First, the configuration of the measurement optical system 21 will be described. In the measurement optical system 21, the objective lens 201 is arranged so as to face the eye E to be inspected, and the first dichroic mirror 202 and the second dichroic mirror 203 are arranged on the optical axis thereof. With these dichroic mirrors, the optical path from the objective lens 201 is the optical path L1 of the OCT optical system, the SLO optical system for observing the eye E and acquiring the SLO fundus image, and the optical path L2 for the fixation lamp, and the anterior eye. Each wavelength band is branched to the observation optical path L3.

In this embodiment, an optical path L3 for anterior ocular segment observation is provided in the reflection direction of the first dichroic mirror 202, and an optical path L1 for the OCT optical system and an optical path L2 for the SLO optical system and the fixation lamp are provided in the transmission direction. It is provided. The optical path L1 of the OCT optical system is provided in the reflection direction of the second dichroic mirror 203, and the SLO optical system and the optical path L2 for the fixation lamp are provided in the transmission direction. However, the direction in which the optical path of each optical system is provided is not limited to this, and may be arbitrarily changed according to the desired configuration.

An SLO scanning unit 204,

lenses

205 and 206, a mirror 207, a third dichroic mirror 208, a photodiode 209, an SLO light source 210, and a fixation lamp 211 are provided in the optical path L2 for the SLO optical system and the fixation lamp. There is. In this embodiment, the SLO light source 210 is provided in the reflection direction of the third dichroic mirror 208, and the fixation lamp 211 is provided in the transmission direction. However, the fixation lamp 211 may be provided in the reflection direction of the third dichroic mirror 208 and the SLO light source 210 may be provided in the transmission direction.

The SLO scanning unit 204 is a scanning unit that scans the light emitted from the SLO light source 210 and the fixation lamp 211 on the eye E, and includes an X scanner that scans in the X-axis direction and a Y scanner that scans in the Y-axis direction. Including. In this embodiment, the X scanner is required to perform high-speed scanning, so that it is composed of a polygon mirror and the Y scanner is composed of a galvanometer mirror. The configuration of the SLO scanning unit 204 is not limited to this, and may be arbitrarily changed according to the desired configuration.

The lens 205 can be driven in the optical axis direction indicated by the arrow in the figure by a motor or the like (not shown) controlled by the control unit 30 for focusing the SLO optical system and the fixation lamp. The mirror 207 is a prism in which a perforated mirror or a hollow mirror is vapor-deposited, and can separate the projection light from the SLO light source 210 and the return light from the eye E to be inspected. The third dichroic mirror 208 separates the optical path to the SLO light source 210 and the optical path to the fixation lamp 211 for each wavelength band.

The SLO light source 210 generates light with a wavelength near 780 nm, for example. The photodiode 209 detects the return light from the subject's eye E in the projection light emitted from the SLO light source 210. The fixation lamp 211 is used to generate visible light and promote the fixation of the subject.

The projection light emitted from the SLO light source 210 is reflected by the third dichroic mirror 208, passes through the mirror 207, passes through the

lenses

206 and 205, and is scanned on the eye E by the SLO scanning unit 204. The return light from the eye E to be examined returns through the same path as the projection light, is reflected by the mirror 207, and is guided to the photodiode 209. The control unit 30 can generate an SLO fundus image based on the drive position of the SLO scanning unit 204 and the output from the photodiode 209.

The light emitted from the fixation lamp 211 passes through the third dichroic mirror 208 and the mirror 207, passes through the

lenses

206 and 205, and is scanned on the eye E by the SLO scanning unit 204. At this time, the control unit 30 blinks the fixation lamp 211 in accordance with the movement of the SLO scanning unit 204 to create an arbitrary shape on the eye E to be inspected, thereby promoting the fixation of the subject. be able to.

The

lenses

212 and 213, the split prism 214, and the CCD 215 for observing the anterior segment that detects infrared light are arranged in the optical path L3 for observing the anterior eye. The CCD 215 has a sensitivity around the wavelength of irradiation light for anterior ocular segment observation (not shown), specifically, around 970 nm. The split prism 214 is arranged at a position conjugate with the pupil of the eye E to be inspected. The control unit 30 can generate an anterior segment image based on the output of the CCD 215. The control unit 30 can detect the distance in the Z-axis direction (front-back direction) of the measurement optical system 21 with respect to the eye E by using the split image of the anterior segment based on the light that has passed through the split prism 214.

The optical path L1 of the OCT optical system is provided with an OCT optical system for capturing a tomographic image of the eye E to be inspected. More specifically, the OCT optical system is used to obtain an interference signal for generating a tomographic image of the eye E to be inspected.

An XY scanner 216,

lenses

217 and 218, and a fiber end of an optical fiber 224 are provided in the optical path L1 of the OCT optical system. The XY scanner 216 is an OCT scanning unit for scanning the measurement light E described below on the eye E. Although the XY scanner 216 is illustrated as a single mirror, it is composed of two galvanometer mirrors for scanning the measurement light in the biaxial directions of the X-axis direction and the Y-axis direction. The configuration of the XY scanner 216 is not limited to this, and may be arbitrarily changed according to the desired configuration. For example, the XY scanner 216 may be configured by a MEMS mirror or the like that can deflect light in a two-dimensional direction with one sheet.

The lens 217 can be driven in the optical axis direction indicated by the arrow in the figure by a motor or the like (not shown) controlled by the control unit 30. The control unit 30 can focus the measurement light emitted from the optical fiber 224 connected to the optical coupler 219 on the eye E by driving the lens 217 by a motor (not shown) or the like. Due to this focusing, the return light of the measurement light from the eye E is simultaneously imaged and incident on the tip of the optical fiber 224 in a spot shape.

Next, the configurations of the optical path from the OCT light source 220, the reference optical system, and the spectroscope 230 will be described. The OCT light source 220 is connected to the optical coupler 219 via the optical fiber 225.

Optical fibers

224, 225, 226 and 227 are connected to the optical coupler 219. The

optical fibers

224, 225, 226 and 227 are single mode optical fibers connected to and integrated with the optical coupler 219.

The fiber end of the optical fiber 224 is arranged on the OCT optical path L1, and the measuring light enters the OCT optical path L1 through the optical fiber 224 and the polarization adjusting unit 228 provided on the optical fiber 224 on the measuring light side. On the other hand, the fiber end of the optical fiber 226 is disposed in the optical path of the reference optical system, and the reference light described later enters the optical path of the reference optical system through the optical fiber 226 and the polarization adjusting unit 229 on the reference light side provided in the optical fiber 226. To do. A lens 223, a dispersion compensation glass 222, and a reference mirror 221 are provided in the optical path of the reference optical system. Further, the optical fiber 227 is connected to the spectroscope 230.

According to these configurations, Michelson interference system is configured. In this embodiment, the Michelson interference system is used as the interference system, but a Mach-Zehnder interference system may be used. Depending on the light quantity difference between the measurement light and the reference light, a Mach-Zehnder interference system can be used when the light quantity difference is large, and a Michelson interference system can be used when the light quantity difference is relatively small.

The OCT light source 220 emits light used for measurement by OCT. In this embodiment, an SLD (Super Luminescent Diode), which is a typical low-coherent light source, is used as the OCT light source 220. The center wavelength of the SLD in this example was 855 nm, and the wavelength band width was about 100 nm. Here, the bandwidth is an important parameter because it affects the resolution of the obtained tomographic image in the optical axis direction. Although SLD is selected here as the type of light source, ASE (Amplified Spontaneous Emission) or the like may be used as long as low-coherent light can be emitted. The center wavelength may be near-infrared light in view of photographing the eye. Further, since the central wavelength affects the lateral resolution of the tomographic image obtained, the wavelength can be as short as possible. In this embodiment, the central wavelength is set to 855 nm for both reasons.

The light emitted from the OCT light source 220 enters the optical coupler 219 through the optical fiber 225. The light incident on the optical coupler 219 is split via the optical coupler 219 into measurement light traveling toward the optical fiber 224 side and reference light traveling toward the optical fiber 226 side. The measurement light is applied to the subject's eye E, which is the subject, through the optical path L1 of the OCT optical system described above. The return light of the measurement light due to the reflection or scattering of the eye E to be examined reaches the optical coupler 219 through the same optical path.

On the other hand, the reference light reaches and is reflected by the reference mirror 221 via the optical fiber 226, the lens 223, and the dispersion compensation glass 222 inserted to match the dispersion of the measurement light and the reference light. After that, the reference light returns through the same optical path and reaches the optical coupler 219. Here, the reference mirror 221 is held by a motor or the like (not shown) controlled by the control unit 30 so as to be adjustable in the optical axis direction indicated by the arrow in the figure.

In the optical coupler 219, the measurement light and the reference light are combined into interference light. Here, the measurement light and the reference light cause interference when the optical path length of the measurement light and the optical path length of the reference light become substantially the same. The control unit 30 controls a motor (not shown) or the like to move the reference mirror 221 in the optical axis direction, so that the optical path length of the reference light can be matched with the optical path length of the measurement light that changes depending on the eye E to be inspected.

The polarization adjusting unit 228 on the measurement light side and the polarization adjusting unit 229 on the reference light side have some portions in which the optical fiber is looped. The

polarization adjusting units

228 and 229 adjust the polarization states of the measurement light and the reference light by adjusting the polarization states of the measurement light and the reference light by rotating the loop-shaped portion about the longitudinal direction of the optical fiber and twisting the fiber. You can

The interference light generated in the optical coupler 219 is guided to the spectroscope 230 provided in the base section 23 via the optical fiber 227. The spectroscope 230 is provided with

lenses

234 and 232, a diffraction grating 233, and a line sensor 231. The interference light emitted from the optical fiber 227 becomes parallel light through the lens 234, is then dispersed by the diffraction grating 233, and is imaged on the line sensor 231 by the lens 232. The control unit 30 can generate a tomographic image of the eye E by using the interference signal based on the interference light, which is output from the line sensor 231.

With the above-described configuration, by using the imaging unit 20, a tomographic image of the eye E to be inspected can be acquired, and an SLO fundus image of the eye E to be inspected with high contrast even with near infrared light can be acquired. can do.

(How to take a tomographic image)
Next, a method of capturing a tomographic image using the OCT apparatus 1 will be described. In the OCT apparatus 1, the control unit 30 controls the XY scanner 216 to capture a tomographic image of a predetermined portion of the eye E to be inspected. Here, the locus along which the measurement light is scanned on the eye E is referred to as a scan pattern (scan pattern). This scan pattern includes, for example, a cross scan in which a single point is scanned in a vertical and horizontal cross shape, and a 3D scan in which the entire area is scanned to obtain a three-dimensional tomographic image as a result. Cross-scan is suitable for detailed observation of a specific region, and 3D scan is suitable for observing the layer structure and layer thickness of the entire retina.

Here, we will explain the shooting method when 3D scanning is performed. First, the measurement light is scanned (scanned) in the X-axis direction (main scanning direction) in the figure, and the line sensor 231 acquires information about a predetermined number of images from the imaging range of the eye E to be examined in the X-axis direction.

Here, acquiring the tomographic information in the depth direction at one point in the X-axis direction of the eye E to be examined is called A scan. The luminance distribution on the line sensor 231 obtained by the A scan is subjected to fast Fourier transform (FFT: Fast Fourier Transform), and the linear luminance distribution obtained by the FFT is converted into density information for display on the display unit 50. As a result, an A-scan image based on the information acquired by the A-scan can be generated. Further, by arranging a plurality of A-scan images, a B-scan image that is a two-dimensional image can be acquired.

After capturing a plurality of A-scan images for forming one B-scan image, the scan position in the Y-axis direction (sub-scanning direction) is moved, and scanning in the X-axis direction is performed again. Images can be acquired. By displaying a plurality of B-scan images or a three-dimensional tomographic image constructed from the plurality of B-scan images on the display unit 50, the examiner can observe the three-dimensional tomographic state of the eye E to be examined. The examiner can diagnose the eye E to be inspected based on the image. Here, an example in which a three-dimensional tomographic image is acquired by obtaining a plurality of B-scan images in the X-axis direction has been shown, but a three-dimensional tomographic image may be obtained by obtaining a plurality of B-scan images in the Y-axis direction. . The scanning direction is not limited to the X-axis direction and the Y-axis direction, and may be any axial direction orthogonal to the Z-axis direction and intersecting each other.

(Configuration of control unit)
Next, the control unit 30 will be described with reference to FIG. FIG. 3 shows a schematic configuration example of the control unit 30. The control unit 30 is provided with an acquisition unit 310, an image processing unit 320, a drive control unit 330, a storage unit 340, and a display control unit 350.

The acquisition unit 310 can acquire the output signals of the CCD 215 and the photodiode 209 and the output signal data of the line sensor 231 corresponding to the interference signal of the eye E from the imaging unit 20. The data of the output signal acquired by the acquisition unit 310 may be an analog signal or a digital signal. When the acquisition unit 310 acquires an analog signal, the control unit 30 can convert the analog signal into a digital signal.

Further, the acquisition unit 310 can acquire various data such as tomographic data generated by the image processing unit 320 and various images such as a tomographic image, an SLO fundus image, and an anterior segment image. Here, the tomographic data is data including information on a tomographic image of a subject, and includes data including a signal obtained by performing Fourier transform on an interference signal by OCT, a signal obtained by performing arbitrary processing on the signal, and the like.

Further, the acquisition unit 310 includes a shooting condition group of images to be image-processed (for example, shooting date / time, shooting region name, shooting region, shooting angle of view, shooting method, image resolution and gradation, image size of image, image filter). , And information about the image data format). The shooting condition group is not limited to the illustrated one. Further, the shooting condition group does not need to include all of the exemplified ones, and may include some of them.

Specifically, the acquisition unit 310 acquires the photographing conditions of the photographing unit 20 when the image is photographed. In addition, the acquisition unit 310 can also acquire the shooting condition group stored in the data structure forming the image according to the data format of the image. In addition, when the shooting condition is not stored in the data structure of the image, the acquisition unit 310 can also acquire the shooting information group including the shooting condition group from a storage device or the like separately storing the shooting condition.

The acquisition unit 310 can also acquire information for identifying the eye to be inspected, such as the subject identification number, from the input unit 40 or the like. The acquisition unit 310 may acquire various data, various images, and various information from the storage unit 340 and other devices (not shown) connected to the control unit 30. The acquisition unit 310 can store various acquired data and images in the storage unit 340.

The image processing unit 320 can generate a tomographic image from the data acquired by the acquisition unit 310 or the data stored in the storage unit 340, and can perform image processing on the generated or acquired tomographic image. The image processing unit 320 is provided with a tomographic image generation unit 321 and an image quality improvement unit 322.

The tomographic image generation unit 321 generates tomographic data by performing wave number conversion, Fourier transform, absolute value conversion (acquisition of amplitude) or the like on the data of the interference signal acquired by the acquisition unit 310, and based on the tomographic data, the tomographic image is generated. A tomographic image of the optometry E can be generated. Here, the data of the interference signal acquired by the acquisition unit 310 may be the data of the signal output from the line sensor 231, or may be acquired from a device (not shown) connected to the storage unit 340 or the control unit 30. It may be the data of the generated interference signal. Any known method may be adopted as the method of generating the tomographic image, and detailed description thereof will be omitted.

The image quality improving unit 322 generates a high quality tomographic image from the tomographic image generated by the tomographic image generating unit 321 using a learned model described later. The image quality improving unit 322 acquires not only the tomographic image captured by the image capturing unit 20 but also the acquisition unit 310 from the storage unit 340 and other devices (not shown) connected to the control unit 30. It is also possible to generate a high-quality tomographic image based on the tomographic image.

The drive control unit 330 includes the OCT light source 220 of the imaging unit 20 connected to the control unit 30, the XY scanner 216, the lens 217, the reference mirror 221, the SLO light source 210, the SLO scanning unit 204, the lens 205, and the fixation lamp 211. It is possible to control driving of components such as.

The storage unit 340 can store various data acquired by the acquisition unit 310 and various images and data such as tomographic images generated and processed by the image processing unit 320. In addition, the storage unit 340 stores information about the subject's eye such as the subject's attributes (name, age, etc.) and measurement results (e.g., axial length and intraocular pressure) acquired using other test equipment, imaging parameters, and images. The analysis parameter and the parameter set by the operator can be stored. Further, the storage unit 340 can also store the statistical information of the normal database. Note that these images and information may be stored in an external storage device (not shown). The storage unit 340 can also store a program or the like for executing the functions of the respective components of the control unit 30 by being executed by the processor.

The display control unit 350 can cause the display unit 50 to display various images acquired by the acquisition unit 310 and various images such as tomographic images generated and processed by the image processing unit 320. Further, the display control unit 350 can cause the display unit 50 to display information and the like input by the user.

The control unit 30 may be configured using a general-purpose computer, for example. The control unit 30 may be configured using a dedicated computer for the OCT apparatus 1. The control unit 30 includes a storage medium including a CPU (Central Processing Unit), an MPU (Micro Processing Unit) (not shown), and a memory such as an optical disk and a ROM (Read Only Memory). Each component other than the storage unit 340 of the control unit 30 may be configured by a software module executed by a processor such as a CPU or MPU. Further, each component may be configured by a circuit such as an ASIC that performs a specific function, an independent device, or the like. The storage unit 340 may be configured by any storage medium such as an optical disk or a memory, for example.

Note that the control unit 30 may have one or more processors such as CPU and storage media such as ROM. Therefore, each component of the control unit 30 functions when at least one processor and at least one storage medium are connected, and at least one processor executes a program stored in at least one storage medium. May be configured to do so. The processor is not limited to the CPU and MPU, and may be a GPU (Graphics Processing Unit), an FPGA (Field-Programmable Gate Array), or the like.

Next, in explaining the tomographic image quality improving process according to the present embodiment, the segmentation process and the gradation conversion process will be described with reference to FIGS. 4 to 5D.

(Segmentation process)
FIG. 4 shows an example of a tomographic image in which the boundaries of the respective regions of the retinal layer have been detected by the segmentation processing. In the segmentation processing of the tomographic image, the boundary between the regions included in the tomographic image can be detected. In the tomographic image 400 shown in FIG. 4, the boundary 401 between the vitreous part and the retina and the boundary between the retina and the choroid part. 402 has been detected. By detecting the

boundaries

401 and 402 in the tomographic image 400, a region 403 of the retina between the

boundaries

401 and 402, a region 404 of the vitreous body on the shallow side of the boundary 401, and a region of the deep side from the boundary 402. An area 405 of a choroid can be identified.

Any known method can be used as the segmentation process. In one example, first, a median filter and a Sobel filter are applied to a tomographic image to be processed, and a median image and a Sobel image are generated. Next, a profile is generated for each tomographic data corresponding to the A scan from the generated median image and Sobel image. Here, the generated profile is a brightness value profile for the median image and a gradient profile for the Sobel image. Then, the peak in the profile generated from the Sobel image is detected. By referring to the profile of the median image corresponding to before and after the detected peak or between the peaks, the boundary of each region of the retinal layer can be detected.

(Gradation conversion processing)
Next, with reference to FIGS. 5A to 5D, a gradation conversion process for enhancing the contrast in the vitreous region 404 and the choroid region 405, the retina region 403, or all these regions will be described. FIG. 5A shows a tomographic image 500 as an example of an original tomographic image (hereinafter, an original tomographic image) obtained by photographing the eye E. The tomographic image 500 is usually in an integer format of 10 bits or more, and is data of a high dynamic range including information of extremely low brightness to information of high brightness. On the other hand, as described above, the data that can be displayed on the display unit 50 is low dynamic range data such as an 8-bit integer format. Therefore, the gradation conversion processing is performed so that the original tomographic image 500 has low dynamic range data for display.

FIG. 5B shows a tomographic image 501 that has been subjected to gradation conversion processing so that the region of the retina can be easily observed with respect to the original tomographic image 500, in other words, the contrast of the region of the retina is ensured. Indicates. Here, with reference to FIGS. 6A and 6B, a gradation conversion process for ensuring the contrast of the region of the retina will be described.

FIG. 6A shows the appearance frequency of the brightness value in the tomographic image 500, and shows a brightness value range 601 corresponding to the brightness value of the region of the retina. The range of the brightness value corresponding to the brightness value of the region of the retina may be determined based on an average brightness range obtained empirically for the region of the retina. In the gradation conversion process, as shown in FIG. 6B, the conversion process is performed such that the brightness range 601 corresponding to the brightness value of the region of the retina is a wide range of brightness values related to the display data. As a result, it is possible to generate the display tomographic image 501 in which the region of the retina is easily observed.

FIG. 5C shows gradation conversion for the original tomographic image 500 so that the regions of the vitreous part and the choroid part can be easily observed, in other words, the contrast of the vitreous part and the choroid part can be ensured. A tomographic image 502 that has been processed is shown. Here, with reference to FIG. 7A and FIG. 7B, a gradation conversion process for ensuring the contrast in the regions of the vitreous part and the choroid part will be described.

FIG. 7A shows the appearance frequency of the brightness value in the tomographic image 500, and shows a range 701 of brightness values corresponding to the brightness values in the regions of the vitreous part and the choroid part. The range of brightness values corresponding to the brightness values of the vitreous body portion and the choroid portion may be determined based on an empirically obtained average brightness range and the like for the vitreous portion and the choroid portion. In the gradation conversion process, as shown in FIG. 7B, the conversion process is performed so that the brightness range 701 corresponding to the brightness values of the vitreous body part and the choroid part is a wide range of brightness values related to the display data. To do. As a result, it is possible to generate the tomographic image 502 for display in which the regions of the vitreous body portion and the choroid portion can be easily observed.

FIG. 5D shows a tomographic image 503 that has been subjected to gradation conversion processing so that the regions of the retina, vitreous body, and choroid are easy to observe, in other words, the contrast of these regions is ensured. In this case, first, a boundary 401 between the vitreous part and the retina and a boundary 402 between the retina and the choroid are detected by the above-described segmentation processing, and the region 403 of the retina, the region 404 of the vitreous part, and the choroid are detected. A partial area 405 is specified.

Thereafter, for the retina area 403, as shown in FIG. 6B, gradation conversion is performed so that the range 601 of brightness values corresponding to the area of the retina becomes a wide range of brightness values relating to display data. Perform processing. On the other hand, for the region 404 of the vitreous part and the region 405 of the choroid part, as shown in FIG. 7B, the range 701 of the brightness value corresponding to the region of the vitreous part and the choroid part is the data for display. The gradation conversion processing is performed so that the brightness value is in a wide range. This makes it possible to generate a tomographic image 503 for display, in which the regions of the retina, vitreous, and choroid are easy to observe.

Note that not only the same conversion process can be performed on the vitreous part and the choroid part, but also different conversion processes can be performed on the vitreous part and the choroid part. Further, not only linear conversion processing but also S-curve conversion processing such as sigmoid conversion and γ conversion can be performed.

In the gradation conversion process for generating a tomographic image that makes it easy to observe the general regions of the retina, vitreous part, and choroid, the region in the tomographic image is detected by segmentation processing. . Therefore, in the diseased eye, if the erroneous detection due to the segmentation process occurs due to the change in the layer structure due to the lesion, the gradation conversion process may not be properly performed, and it may not be possible to generate a tomographic image in which it is easy to observe the global region. is there.

On the other hand, in the control unit 30 according to the present embodiment, different image processing is performed for each region in the tomographic image using the learned model of the machine learning model according to an arbitrary machine learning algorithm such as deep learning. Such a high-quality tomographic image that is easy to observe is generated. When performing the segmentation processing using the learned model, for example, even if the layer structure changes due to the lesion in the diseased eye, the image processing can be appropriately performed according to the learned tendency.

In the present specification, a high-quality image (high-quality image) refers to an image converted into an image of a quality suitable for image diagnosis, and a high-quality processing refers to an input image suitable for image diagnosis. It means converting to an image of high quality. Here, the content of the image quality suitable for image diagnosis depends on what is desired to be diagnosed by various image diagnosis. Therefore, although it cannot be said unequivocally, for example, the image quality suitable for image diagnosis is shown by colors and gradations that make it easy to observe the shooting target, there is little noise, high contrast, large image size, Includes image quality such as high resolution. Further, it is possible to include an image quality in which an object or gradation that does not actually exist and which is drawn in the process of image generation is removed from the image.

(Machine learning learning)
Here, the learned model according to the present embodiment will be described with reference to FIGS. 8A to 10. First, with reference to FIGS. 8A to 9B, teacher data (learning data) regarding a learned model will be described.

-Teacher data consists of pairs of one or more input data and output data. In the present embodiment, specifically, the original tomographic image such as the tomographic image 500 acquired by the OCT apparatus is used as input data, and image processing is performed so that a global observation of the tomographic image 503 or the like is possible. The teacher data is composed of a pair of groups using the tomographic image as output data. The output data can be an image obtained by performing image processing on the tomographic image that is the input data.

First, a case where one of the pair groups forming the teacher data is the original tomographic image 810 and the high-quality tomographic image 820 shown in FIGS. 8A and 8B will be described. In this case, as shown in FIGS. 8A and 8B, a pair is formed by using the entire original tomographic image 810 as input data and the entire high-quality tomographic image 820 as output data. In the example shown in FIGS. 8A and 8B, a pair of input data and output data is formed by the entire image, but the pair is not limited to this.

For example, as shown in FIGS. 9A and 9B, a rectangular area image 911 of an original tomographic image 910 is used as input data, and a rectangular area image 921 that is a corresponding imaging area in a high-quality tomographic image 920 is used as output data. You may comprise a pair. Here, the rectangular area image 911 and the rectangular area image 921 are images corresponding to each other in the tomographic image 910 and the high-quality tomographic image 920.

Note that during learning, the scan range (shooting angle of view) and scan density (the number of A scans and the number of B scans) can be normalized to make the image sizes uniform, and the rectangular area size at the time of learning can be made uniform. Further, the rectangular area images shown in FIGS. 8A to 9B are examples of rectangular area sizes when learning separately.

Also, the number of rectangular areas can be set to one in the example shown in FIGS. 8A and 8B, and can be set to a plurality in the examples shown in FIGS. 9A and 9B. For example, in the example shown in FIGS. 9A and 9B, the

rectangular area images

912 and 913 of the tomographic image 910 are used as input data, and the

rectangular area images

922 and 923 of the corresponding imaging areas in the high-quality tomographic image 920 are used as output data. You can also configure pairs. In this way, it is possible to create different pairs of rectangular area images from each pair of tomographic images and high-quality tomographic images. In the original tomographic image and the high-quality tomographic image, by creating a large number of pairs of rectangular area images while changing the position of the rectangular area to different coordinates, it is possible to enhance the pair group forming the teacher data. it can.

Here, the rectangular area image 911 is an image of the area of the retina in the original tomographic image 910, and the rectangular area image 921 is subjected to image processing such as gradation conversion processing so that global observation is possible. It is an image of the region of the retina in the high-quality tomographic image 920. Similarly, the rectangular area image 912 is an image of the vitreous body area in the original tomographic image 910, and the rectangular area image 922 is an image of the vitreous body area in the high-quality tomographic image 920. The rectangular area image 913 is an image of the area of the choroid in the original tomographic image 910, and the rectangular area image 923 is an image of the area of the choroid in the high-quality tomographic image 920.

In the example shown in FIGS. 9A and 9B, the rectangular areas are discretely shown, but the original tomographic image and the high-quality tomographic image are converted into a rectangular area image group having a constant image size and continuous without a gap. It can be divided. Further, the original tomographic image and the high-quality tomographic image may be divided into rectangular area image groups corresponding to each other at random positions. As described above, by selecting an image of a smaller area as a pair of input data and output data as a rectangular area, a large amount of pair data is generated from the tomographic image 910 and the tomographic image 920 that form the original pair. it can. Therefore, the time required to train the machine learning model can be shortened.

The output data is not limited to high-quality tomographic images generated from one tomographic image. You may use the tomographic image for a display produced | generated about the tomographic image which carried out the arithmetic mean processing etc. using the several tomographic image which image | photographed the same site | part of the to-be-tested eye several times.

Note that the rectangular area is not limited to a square and may be a rectangle. Further, the rectangular area may be one A scan. Further, when preparing output data for learning, not only is it generated by a predetermined automatic process, but also better data can be prepared by manual adjustment.

Furthermore, of the group of pairs that make up the teacher data, pairs that do not contribute to higher image quality can be removed from the teacher data. For example, when the high-quality image that is the output data forming the pair of teacher data has an image quality that is not suitable for image diagnosis, the image output by the learned model learned using the teacher data is also not suitable for image diagnosis. There is a possibility that the image quality will end up. Therefore, it is possible to reduce the possibility that the learned model will generate an image having an image quality not suitable for image diagnosis by removing from the teacher data a pair whose output data has an image quality not suitable for image diagnosis.

Also, when the structure or position of the imaged object drawn in the pair of images is significantly different, the learned model learned using the teacher data draws the imaged object at a structure or position significantly different from the input image. There is a possibility of outputting an image that is not suitable for the image diagnosis. Therefore, a pair of input data and output data, which differ greatly in the structure or position of the imaged object to be drawn, can be removed from the teacher data.

Next, as an example of the learned model according to the present embodiment, a convolutional neural network (CNN: Convolutional Neural Network) that performs image quality enhancement processing on an input tomographic image will be described with reference to FIG.

The learned model shown in FIG. 10 is composed of a plurality of layer groups that are responsible for processing the input value group and outputting it. The types of layers included in the learned model configuration 1001 include a convolutional layer, a downsampling layer, an upsampling layer, and a merging layer.

The convolution layer is a layer that performs convolution processing on the input value group according to the parameters such as the set kernel size of the filter, the number of filters, the stride value, and the dilation value. The number of dimensions of the kernel size of the filter may be changed according to the number of dimensions of the input image.

The down-sampling layer is a layer that performs processing to reduce the number of output value groups to less than the number of input value groups by thinning out or combining the input value groups. Specifically, for example, there is Max Pooling processing as such processing.

The upsampling layer is a layer that performs processing to make the number of output value groups larger than the number of input value groups by duplicating the input value group or adding values interpolated from the input value group. Specifically, such processing includes, for example, linear interpolation processing.

The composition layer is a layer that inputs a value group such as an output value group of a certain layer or a pixel value group that constitutes an image from a plurality of sources, and performs a process of concatenating or adding them to combine them.

As the parameters set in the convolutional layer group included in the configuration 1001 illustrated in FIG. 10, for example, by setting the kernel size of the filter to 3 pixels in width and 3 pixels in height, and the number of filters to 64, constant accuracy can be obtained. It is possible to improve the image quality. However, it should be noted that the degree of reproducibility of the training tendency from the teacher data to the output data may differ if the parameter settings for the layers and node groups that make up the neural network are different. In other words, in many cases, the appropriate parameter differs depending on the mode of implementation, and thus it can be changed to a preferable value as necessary.

-In addition to changing the parameters as described above, changing the configuration of the CNN may allow the CNN to obtain better characteristics. The better characteristics are, for example, that the accuracy of the image quality improvement processing is high, the time of the image quality improvement processing is short, the time required for training the machine learning model is short, and the like.

Note that the CNN configuration 1001 used in the present embodiment has a U-function that has an encoder function including a plurality of layers including a plurality of downsampling layers and a decoder function including a plurality of layers including a plurality of upsampling layers. This is a net-type machine learning model. In the U-net type machine learning model, ambiguous position information (spatial information) in a plurality of layers configured as encoders is converted into a same-dimensional layer (layers corresponding to each other) in a plurality of layers configured as decoders. ) Is used (for example, using a skip connection).

Although not shown, as a modification example of the CNN configuration, for example, a batch normalization layer or an activation layer using a normalized linear function (Rectifier Linear Unit) may be incorporated after the convolutional layer. .

When data is input to the learned model of such a machine learning model, data according to the design of the machine learning model is output. For example, output data that is likely to correspond to the input data is output according to the tendency trained using the teacher data. When an original tomographic image is input to the learned model according to this embodiment, a high-quality tomographic image that is used for global observation and in which the retina, vitreous part, and choroid part are easily observed is output.

Note that when learning is performed by dividing the area of the tomographic image, the learned model outputs a rectangular area image that is a high-quality tomographic image corresponding to each rectangular area. In this case, the image quality improving unit 322 first divides the tomographic image that is the input image into rectangular area image groups based on the image size at the time of learning, and inputs the divided rectangular area image groups to the learned model. After that, the image quality improving unit 322 sets each of the rectangular area image groups, which are high-quality tomographic images obtained by using the learned model, in the same manner as each of the rectangular area image groups input to the learned model. Arrange in a positional relationship and combine. As a result, the image quality improving unit 322 can generate a high quality tomographic image corresponding to the input tomographic image.

(flowchart)
Next, with reference to FIG. 11, a series of image processing according to the present embodiment will be described. FIG. 11 is a flowchart of a series of image processing according to this embodiment.

First, in step S1101, the acquisition unit 310 acquires tomographic information obtained by imaging the eye E to be inspected. The acquisition unit 310 may acquire the tomographic information of the eye E using the imaging unit 20 or may acquire the tomographic information from the storage unit 340 or another device connected to the control unit 30.

Here, when acquiring the tomographic information of the eye E to be inspected using the imaging unit 20, selection of an imaging mode, setting of various imaging parameters such as a scan pattern, a scan range, a focus, and a fixation lamp position, and adjustment are performed. After performing the scan, the scan of the eye E can be started.

In step S1102, the tomographic image generation unit 321 generates a tomographic image based on the acquired tomographic information of the eye E to be inspected. When the acquisition unit 310 acquires the tomographic image from the storage unit 340 or another device connected to the control unit 30 in step S1101, step S1102 may be omitted.

In step S1103, the image quality improving unit 322 uses the learned model to obtain high image quality such that different image processing is performed for each region from the tomographic image generated in step S1102 or acquired in step S1101. Generate a tomographic image.

When the learned model is learning by dividing the image area, the image quality improving unit 322 first divides the tomographic image, which is the input image, into a rectangular area image group based on the image size at the time of learning. Divide and input the divided rectangular area image group into the learned model. After that, the image quality improving unit 322 sets each of the rectangular area image groups, which are high-quality tomographic images obtained by using the learned model, in the same manner as each of the rectangular area image groups input to the learned model. A final high-quality tomographic image is generated by arranging them in a positional relationship and combining them.

In step S1104, the display control unit 350 causes the display unit 50 to display the high-quality tomographic image generated in step S1103. When the display processing by the display control unit 350 ends, a series of image processing ends.

According to such processing, it is possible to generate and display a high-quality tomographic image in which different image processing is performed on different regions using the learned model. Particularly, in the present embodiment, it is possible to generate and display an image suitable for global observation in which the contrast of the vitreous body, the choroid and the retina is emphasized even in a diseased eye.

As described above, the control unit 30 according to the present embodiment includes the acquisition unit 310 and the image quality improvement unit 322. The acquisition unit 310 acquires a first tomographic image (a tomographic image using optical interference) of the subject's eye E, which is the subject. The image quality improving unit 322 uses the learned model to generate a second tomographic image from the first tomographic image (first medical image) such that different regions in the first tomographic image are subjected to different image processing. An image (second medical image) is generated. Further, in the present embodiment, the learning data of the learned model includes a tomographic image that has been subjected to the gradation conversion processing according to the area of the eye E to be inspected.

With such a configuration, it is possible to generate and display a high-quality tomographic image in which different image processing is performed on different regions using the learned model. In particular, in the present embodiment, it is possible to obtain a display image in which the internal structures of the retina, vitreous body, and choroid can be observed in detail even when a good result cannot be obtained by segmentation of a tomographic image in a diseased eye or the like.

In addition, in the present embodiment, the image quality improving unit 322 can generate a high quality tomographic image in which each region has high image quality by using the learned model. Therefore, the image quality improving unit 322 uses the learned model and determines that the first tomographic image includes a different region between the first region and the second region different from the first region in the first tomographic image. It is possible to generate a high-quality second tomographic image. Here, for example, the first region may be a retina region and the second region may be a vitreous region. Moreover, the number of regions in which the image quality is improved is not limited to two, and may be three or more. In this case, for example, the third region, which is different from the first and second regions in which high image quality is performed, may be the region of the choroid. It should be noted that each area in which the image quality is improved may be arbitrarily changed according to a desired configuration. From this viewpoint as well, the control unit 30 according to the present embodiment can generate an image in which appropriate image processing is performed for each observation target region.

In the learned model according to the present embodiment, an image subjected to an appropriate gradation conversion process for each area is used as output data of teacher data, but the teacher data is not limited to this. For example, as the output data of the teacher data, a high-quality image obtained by performing superimposition processing such as arithmetic mean and maximum posterior probability estimation processing (MAP estimation processing) on the original image group for each area of the tomographic image You may use. Here, the original image means a tomographic image which is input data.

In the MAP estimation process, a likelihood function is obtained from the probability density of each pixel value in a plurality of images, and the true signal value (pixel value) is estimated using the obtained likelihood function. The high-quality image obtained by the MAP estimation process becomes a high-contrast image based on the pixel value close to the true signal value. In addition, since the estimated signal value is obtained based on the probability density, noise that is randomly generated is reduced in the high-quality image obtained by the MAP estimation process. Therefore, by using the learned model that has been trained with the high-quality image obtained by the MAP estimation process as the teacher data, noise is reduced from the input image and high contrast is obtained, which is suitable for image diagnosis. It is possible to generate a high quality image. A method of generating a pair of input data and output data of the teacher data may be the same as the method of using the superimposed image as the teacher data.

Also, as the output data of the teacher data, a high quality image obtained by applying a smoothing filter process using an average value filter to the original image may be used. In this case, by using the learned model, a high-quality image in which random noise is reduced can be generated from the input image. The method of generating the pair of the input data and the output data of the teacher data may be the same method as when the image subjected to the gradation conversion process is used as the teacher data.

Note that an image acquired from a photographing device having the same image quality tendency as that of the photographing unit 20 may be used as the input data of the teacher data. Further, as the output data of the teacher data, a high-quality image obtained by a high-cost process such as a successive approximation method may be used, and the subject corresponding to the input data is a photographing apparatus having higher performance than the photographing unit 20. You may use the high quality image acquired by photographing with. Further, as the output data, a high-quality image obtained by performing noise reduction processing based on a rule based on the structure of the subject may be used. Here, the noise reduction process can include, for example, a process of replacing a high-luminance pixel of only one pixel, which is apparently noise appearing in the low-luminance region, with an average value of neighboring low-luminance pixel values. . Therefore, for learning of the learned model, an image captured by an image capturing device having a higher performance than the image capturing device used to capture the input image, or an image capturing process that requires more man-hours than the input image capturing process is acquired. The image may be used as teacher data.

Further, the output data of the teacher data is used for each observation target region for an image subjected to the above-described superimposition processing, MAP estimation processing, or the like, or an image photographed by a photographing device having a higher performance than the photographing unit 20. The image may be subjected to different gradation conversion processing. Therefore, the output data of the teacher data is generated by using a combination of gradation conversion processing that differs for each observation target region, other processing related to high image quality, and a tomographic image captured by a high-performance imaging device. It may be a tomographic image. In this case, a tomographic image more suitable for diagnosis can be generated and displayed.

Also, in this embodiment, the original tomographic image is used as the input data, but the input data is not limited to this. For example, a tomographic image whose gradation is converted to facilitate observation of the retina or a tomographic image whose gradation is converted to facilitate observation of the vitreous part and choroid may be used as the input data. In this case, the image quality improving unit 322 inputs the tomographic image corresponding to the input data of the learning data, into which the gradation is converted so that the retina, the vitreous body, and the choroid are easily observed, to the learned model. A high-quality tomographic image can be generated.

Furthermore, the output data may be data with a high dynamic range adjusted to data that facilitates appropriate gradation conversion for each area. In this case, the image quality improving unit 322 can generate a high quality tomographic image by appropriately performing gradation conversion on the high dynamic range data obtained using the learned model.

Although it has been described that the image quality improving unit 322 uses the learned model to generate a high quality image in which the gradation conversion is appropriately performed for the display by the display unit 50. The image quality improving process is not limited to this. The image quality improving unit 322 is only required to be able to generate an image of image quality more suitable for image diagnosis.

In a tomographic image acquired using a trained model, it is possible that tissues such as blood vessels that do not actually exist may be drawn, or tissues that should exist may not be drawn, depending on the tendency of learning. Therefore, when displaying the high-quality tomographic image acquired using the learned model, the display control unit 350 may also display that the tomographic image is acquired using the learned model. In this case, the occurrence of erroneous diagnosis by the operator can be suppressed. Note that the display mode may be arbitrary as long as it can be understood that the image is obtained using the learned model.

(Modification 1)
In the first embodiment, a case has been described in which a partial area (rectangular area) image of a tomographic image that has been subjected to gradation conversion processing so as to allow global observation is used as output data of teacher data. On the other hand, in the first modification, a tomographic image that differs for each region to be observed is used as output data of the teacher data. Hereinafter, the teacher data in this modification will be described with reference to FIGS. 12A to 12C. Since the configuration and processing of the machine learning model according to the present modification other than the teacher data are the same as those in the first embodiment, the same reference numerals are used and description thereof is omitted.

FIG. 12A shows an example of an original tomographic image 1210 related to input data of teacher data. Further, FIG. 12A shows a rectangular region image 1212 of the vitreous region, a rectangular region image 1211 of the retina region, and a rectangular region image 1213 of the choroid region.

FIG. 12B shows a tomographic image 1220 obtained by performing gradation conversion processing on the original tomographic image 1210 so as to ensure the contrast of the region of the retina. Further, FIG. 12B shows a rectangular area image 1221 having a positional relationship with the rectangular area image 1211 of the retina area.

FIG. 12C shows a tomographic image 1230 obtained by performing gradation conversion processing on the original tomographic image 1210 so as to secure the contrast of the vitreous body portion and the choroid portion. Further, FIG. 12C shows a rectangular area image 1232 having a positional relationship with the rectangular area image 1212 of the vitreous portion area, and a rectangular area image 1233 having a positional relationship with the rectangular area image 1213 of the choroid portion area. Has been done.

In this modification, one pair of teacher data is created using the rectangular area image 1211 of the retina area in the original tomographic image 1210 as input data and the rectangular area image 1221 of the retina area in the tomographic image 1220 as output data. To do. Similarly, one pair of teacher data is created using the rectangular region image 1212 of the vitreous region in the original tomographic image 1210 as input data and the rectangular region image 1232 of the vitreous region in the tomographic image 1230 as output data. To do. Further, one pair of teacher data is created using the rectangular area image 1213 of the area of the choroid in the original tomographic image 1210 as input data and the rectangular area image 1233 of the area of the choroid in the tomographic image 1230 as output data.

Even in such a case, as the output data of the teacher data, it is possible to use a tomographic image subjected to an appropriate gradation conversion process for each observation target area. Therefore, the image quality improving unit 322 uses the learned model learned with such teacher data and performs high image quality such that different image processing is performed for each region of the observation target, as in the first embodiment. It is possible to generate various tomographic images.

(Modification 2)
In the first embodiment, a tomographic image obtained by subjecting the original tomographic image to high image quality processing such as gradation conversion processing regardless of the shooting mode is used as output data as teacher data for a machine learning model. Here, in the OCT apparatus, the tendency of the signal intensity in the tomographic image differs depending on the imaging mode. Therefore, in the second modification, the tomographic image acquired in the imaging mode in which the signal intensity of each region to be observed tends to be high is used as the output data of the teacher data.

The teacher data according to this modification will be described below with reference to FIGS. 13A to 15C. Since the configuration and processing of the machine learning model according to the present modification other than the teacher data are the same as those in the first embodiment, the same reference numerals are used and description thereof is omitted. First, as an imaging method for each imaging mode in the OCT apparatus 1, an imaging method in the vitreous mode and the choroid mode will be described.

(How to shoot in vitreous mode)
The imaging method in the vitreous mode of the OCT apparatus 1 will be described with reference to FIGS. 13A to 13C. In the vitreous body mode, as shown in FIG. 13A, a position Z1 in the depth direction (Z-axis direction) where the optical path lengths of the reference light and the measurement light match each other is shallower in the depth direction of the imaging range C10 (vitreous body side). The reference mirror 221 is moved so as to be located at the position of (1), and an image is taken.

In this case, as shown in FIG. 13B, with respect to the position Z1, a positive image is acquired in the imaging range C10 in the plus direction in the Z direction, and a virtual image is acquired in the imaging range C11 in the minus direction. Imaging in the vitreous mode of the OCT apparatus is generally performed by acquiring a normal image of the imaging range C10 as a tomographic image. Here, FIG. 13C shows a tomographic image C12 which is an example of a tomographic image acquired in the vitreous mode. The virtual image on the side of the photographing range C11 can also be acquired as the tomographic image C12. When the virtual image on the imaging range C11 side is acquired as the tomographic image C12, it may be displayed upside down.

In the OCT device, the closer to the position in the depth direction where the optical path lengths of the reference light and the measurement light match, the higher the signal intensity acquired for that area. Therefore, in the tomographic image C12 captured in the vitreous mode, the signal intensity on the side close to the position Z1, that is, on the vitreous side is high.

(How to shoot in choroid mode)
Next, with reference to FIGS. 14A to 14C, an imaging method in the choroid mode of the OCT apparatus will be described. In the choroid mode, as shown in FIG. 14A, the reference mirror is set such that the position Z2 in the depth direction where the optical path lengths of the reference light and the measurement light match with each other is located deeper in the depth direction of the imaging range (choroid side). 221 is moved to take an image.

In this case, as shown in FIG. 14B, with respect to the position Z2, a normal image is acquired in the imaging direction C20 in the negative direction in the Z direction, and a virtual image is acquired in the imaging range C21 in the positive direction. Imaging in the choroidal mode of the OCT apparatus is generally performed by acquiring a virtual image on the imaging range C21 side as a tomographic image. Here, FIG. 14C shows a tomographic image C22 that is an example of a tomographic image acquired in the choroid mode. A normal image on the side of the imaging range C20 can also be acquired as the tomographic image C22. Further, when the virtual image on the side of the imaging range C21 is acquired as the tomographic image C22, it may be displayed upside down.

As described above, in the OCT apparatus, the closer to the position in the depth direction where the optical path lengths of the reference light and the measurement light match, the higher the signal intensity acquired for the area. Therefore, in the tomographic image C22 taken in the choroid mode, the signal intensity on the side close to the position Z2, that is, on the choroid side is high.

In view of such characteristics of the OCT apparatus, in the present modification, as output data of the teacher data of the machine learning model, in an imaging mode that has a tendency that the signal intensity of the observation target region is high, especially in the region. The acquired tomographic image is used. More specifically, in the OCT apparatus, the signal intensity on the vitreous side is high in the tomographic image captured in the vitreous mode, and the signal intensity on the choroidal side is high in the tomographic image captured in the choroidal mode. Therefore, the same region of the same eye to be examined is photographed in the choroid mode and the vitreous mode, and for each partial region image (rectangular region image) of the input data, a tomographic image having a high signal intensity in the corresponding partial region is used as output data. In other words, in this modified example, the learning data of the learned model is a medical image obtained by imaging the subject, and the medical image acquired in the imaging mode corresponding to any of different regions in the medical image. including.

FIG. 15A shows an example of an original tomographic image 1510 relating to input data of teacher data, which is taken in the vitreous mode. Further, FIG. 15A shows a rectangular area image 1511 of the vitreous portion area and a rectangular area image 1512 of the choroid portion area.

FIG. 15B is a tomographic image 1520 obtained by performing gradation conversion processing on a tomographic image obtained by photographing the same region of the same eye to be examined in the vitreous mode so as to secure the contrast of the regions of the retina, vitreous, and choroid. Is shown. Further, FIG. 15B shows a rectangular area image 1521 having a positional relationship with the rectangular area image 1511 of the vitreous body area.

FIG. 15C shows a tomographic image 1530 obtained by performing gradation conversion processing on a tomographic image of the same site of the same eye to be examined in the choroidal mode so as to ensure the contrast of the regions of the retina, vitreous part, and choroid. Shows. Also, FIG. 15C shows a rectangular area image 1532 having a positional relationship with the rectangular area image 1512 of the area of the choroid.

In this modification, a rectangular area image 1511 of the vitreous body area in the original tomographic image 1510 is used as input data, and a rectangular area image 1521 of the vitreous body area in the tomographic image 1520 is used as output data. To create. Similarly, one pair of teacher data is created using the rectangular area image 1512 of the choroidal area in the original tomographic image 1510 as input data and the rectangular area image 1532 of the choroidal area in the tomographic image 1530 as output data. In this modification, since the tomographic image 1530 captured in the choroid mode is vertically inverted from the original tomographic image 1510 related to the input data, the rectangular area image obtained by vertically inverting the rectangular area image 1532 is output data of the teacher data. Used as.

In such a case, as the output data of the teacher data, the gradation corresponding to the region is applied to the tomographic image corresponding to the region of the observation target, particularly, the tomographic image acquired in the imaging mode in which the signal intensity of the region tends to be high. A tomographic image that has undergone the conversion process can be used. In other words, the learning data of the learned model is a medical image obtained by photographing the subject, and the medical image acquired in the photographing mode corresponding to any of different regions in the medical image It may include a medical image that has been subjected to gradation conversion processing corresponding to any of different regions in the medical image. The image quality improving unit 322 can generate a tomographic image with higher image quality for each region of the observation target by using the learned model learned by such teacher data.

Note that the input data of the teacher data is not limited to the original tomographic image taken in the vitreous mode, and may be the original tomographic image taken in the choroid mode. In this case, since the tomographic image taken in the vitreous mode is vertically inverted from the original tomographic image related to the input data, an image obtained by vertically inverting the rectangular area image related to the tomographic image taken in the vitreous mode is output data of the teacher data. Used as.

In addition, the gradation conversion process applied to the tomographic image captured in each imaging mode ensures the contrast of the retina portion, the vitreous portion, and the choroid portion so that a global observation can be performed. It is not limited to such gradation conversion processing. For example, as in the first modification, a tomographic image captured in the vitreous mode is subjected to gradation conversion so as to ensure the contrast of the vitreous region, and the tomographic image is used as output data of the teacher data. Good. Similarly, with respect to the tomographic image captured in the choroid mode, a tomographic image that has been subjected to gradation conversion so as to ensure the contrast of the region of the choroid may be used as the output data of the teacher data.

Output data based on a tomographic image captured in the vitreous mode or output data based on a tomographic image captured in the choroid mode may be used as the output data of the teacher data regarding the region of the retina. . Further, the photographing mode is not limited to the vitreous mode and the choroid mode, and may be arbitrarily set according to a desired configuration. Also in this case, based on the tendency of the signal intensity in the tomographic image according to the imaging mode, a tomographic image having a tendency that the signal intensity of the region is high is used as the output data of the teacher data for each region of the observation target. be able to.

Also in the modified examples 1 and 2, the input data of the teacher data is not limited to the original tomographic image as in the first embodiment, and may be a tomographic image subjected to arbitrary gradation conversion. Further, the output data of the teacher data is not limited to the tomographic image subjected to the gradation conversion, and may be a tomographic image adjusted so that the gradation conversion is easily performed on the original tomographic image.

(Modification 3)
In the first embodiment, the image quality improving unit 322 uses one learned model to generate a high quality image in which different image processing is performed for each region of the target image. On the other hand, in the modified example 3, first, the image quality improving unit 322, for the tomographic image serving as the input data, the label image in which the region is labeled (annotated) for each pixel using the first learned model. To generate. After that, the image quality improving unit 322 uses the second learned model different from the first learned model for the generated label image to generate a high quality image that has been subjected to image processing according to the region. To do. In other words, the image quality improving unit 322 uses a learned model that is different from the learned model for generating the high quality image (second medical image) to be used as the input data tomographic image (first medical image). ) To generate label images with different label values for different areas. Further, the image quality improving unit 322 generates a high quality image from the label image using the learned model for generating the high quality image (second medical image).

In this modification, the first learned model is trained by using the tomographic image as the input data and the teacher data as the output data of the label image in which the region is labeled for each pixel of the tomographic image. As the label image, an image appropriately processed by a conventional segmentation process may be used, or a manually labeled label image may be used. The label may be, for example, a vitreous label, a retina label, a choroid label, or the like. The label may be represented by a character string, or may be a numerical value or the like corresponding to each preset area. Moreover, the label is not limited to the above example, and may indicate an arbitrary area according to a desired configuration.

Further, with respect to the second learned model, learning is performed using teacher data in which a label image is used as input data and a tomographic image obtained by subjecting the label image to high image quality processing according to a label for each pixel is output data. . Note that the image quality improvement processing according to the label for each pixel may include the gradation conversion processing according to the region of the observation target as described above.

In such a case, the image quality improving unit 322 uses the first and second learned models to perform high image processing that is different for each region of the observation target, as in the first embodiment. It is possible to generate a high-quality tomographic image. Further, the learned model outputs output data that is highly likely to correspond to the input data according to the learning tendency. In this regard, the learned model, when learning is performed using an image group having a similar image quality tendency as teacher data, outputs an image having a higher image quality more effectively with respect to the image having the similar tendency. be able to. Therefore, as in this modification, by using the learned model that uses the teacher data labeled for each area, it can be expected that an image with high image quality can be generated more effectively.

As for the teacher data according to this modification, the entire image may be used as in the first embodiment, or the rectangular area image (partial image) may be used. Further, the input data and the output data may be an image after any gradation conversion or an image before gradation conversion depending on a desired configuration.

(Modification 4)
In the first embodiment, the case where the image quality improving unit 322 integrates the partial images of the tomographic images obtained by using the learned model to generate the final high image quality tomographic image has been described. Particularly, in the example described in the first embodiment, the partial image obtained by using the learned model is an image in which different gradation conversion processing is performed for each region of the observation target according to the tendency of learning. Therefore, if the partial images are simply integrated, the distribution of the luminance is different between the area where the different areas are in contact (the connection area) and the area adjacent to this area (for example, the vitreous area or the retina area). Notably, the image edges may be noticeable.

Therefore, in the fourth modification, when the image quality improving unit 322 integrates the partial images obtained by using the learned model, the pixel values of the connected portion of the observation target area are based on the pixel values of the surrounding pixels. And make corrections so that the image edges are not noticeable. As a result, it is possible to generate an image suitable for diagnosis, in which discomfort due to an image edge is reduced.

In this case, the image quality improving unit 322 can correct the brightness value by performing a known arbitrary blending process on the connection portion of the observation target region. Note that the image quality improving unit 322 may perform blending processing on a portion adjacent to the connection portion of the observation target region. Further, the process of making the image edge inconspicuous is not limited to the blending process, and may be any other process.

(Example 2)
In the first embodiment, the generated / acquired tomographic image is uniformly subjected to the high image quality processing using the learned model. On the other hand, in the OCT apparatus according to the second embodiment, the image processing to be applied to the tomographic image is selected according to the instruction of the operator.

The OCT apparatus according to this embodiment will be described below with reference to FIGS. 16 to 18C. Since the configuration other than the control unit according to the present embodiment is the same as that of the OCT apparatus 1 according to the first embodiment, the same reference numerals are used and the description thereof is omitted. Hereinafter, the OCT apparatus according to the present embodiment will be described focusing on the differences from the OCT apparatus according to the first embodiment.

FIG. 16 shows a schematic configuration example of the control unit 1600 according to this embodiment. In the control unit 1600, the configuration of the image processing unit 1620 other than the image quality improving unit 1622 and the selection unit 1623 is the same as the configuration of the control unit 30 according to the first embodiment, and thus the description will be given using the same reference numerals. Omit it.

The image processing unit 1620 is provided with an image quality improving unit 1622 and a selecting unit 1623 in addition to the tomographic image generating unit 321. The selection unit 1623 selects image processing to be applied to the tomographic image according to the instruction from the operator input via the input unit 40.

The image quality improving unit 1622 applies the image processing selected by the selecting unit 1623 to the tomographic image generated by the tomographic image generating unit 321 or the tomographic image acquired by the acquiring unit 310 to generate a high quality tomographic image. To do.

Next, a series of image processing according to the present embodiment will be described with reference to FIG. FIG. 17 is a flowchart of a series of image processing according to this embodiment. Note that steps S1701 and S1702 are the same as steps S1101 and S1102 according to the first embodiment, and a description thereof will be omitted.

When the tomographic image generation unit 321 generates the original tomographic image in step S1702, the process proceeds to step S1703. In step S1703, the acquisition unit 310 acquires an instruction from the operator regarding the selection of the process to be performed on the region of interest in the tomographic image or the tomographic image. At this time, the display control unit 350 can display the processing options on the display unit 50 and present the options to the operator.

In step S1704, the selection unit 1623 selects image processing (image quality enhancement processing) to be applied to the tomographic image according to the instruction from the operator acquired in step S1703. In the present embodiment, the selection unit 1623 selects image quality enhancement processing for the retina, vitreous / choroid membrane quality enhancement processing, or image quality enhancement processing for the entire image in response to an instruction from the operator.

When the image quality improving process for the retina is selected in step S1704, the process proceeds to step S1705. In step S1705, the image quality improving unit 1622 performs gradation conversion processing on the original tomographic image so that the above-described region of the retina can be easily observed, and generates a high image quality tomographic image.

In step S1704, when the image quality enhancement process for the vitreous / choroid is selected, the process proceeds to step S1706. In step S1704, the image quality improving unit 1622 performs a gradation conversion process on the original tomographic image so that the regions of the vitreous part and the choroid part as described above can be easily observed, and a high quality tomographic image is generated. To do.

When the image quality improving process for the entire image is selected in step S1704, the process proceeds to step S1707. In step S1707, the image quality improving unit 1622 uses the learned model for the original tomographic image to generate a high image quality tomographic image in which the retina, vitreous body, and choroid are easy to observe. Since the learned model according to the present embodiment is the same as the learned model according to the first embodiment, description regarding the learned model and learning data will be omitted.

In step S1708, the display control unit 350 causes the display unit 50 to display the high-quality tomographic image generated in step S1705, step S1706, or step S1707. When the display processing by the display control unit 350 ends, a series of image processing ends.

Here, the operation method according to the present embodiment will be described with reference to FIGS. 18A to 18C. 18A to 18C show an example of a display screen including a tomographic image that has undergone image quality enhancement processing according to the option of the region of interest and the selected region.

FIG. 18A shows a display screen 1800 when the retina region is selected as the region of interest. On the display screen 1800, a tomographic image 1802 that has been subjected to gradation conversion processing so that the option 1801 and the region of the retina can be easily observed is displayed.

When the operator desires a region of the retina as a region to be noticed, the operator uses the input unit 40 to select a retina from the three options of retina, vitreous / choroid, and overall retina. Select. The selecting unit 1623 selects the image quality improving process for the region of the retina according to the instruction from the operator, the image quality improving unit 1622 applies the selected image quality improving process for the tomographic image, and the retina region is observed. A tomographic image 1802 that is easy to perform is generated. The display control unit 350 displays the generated tomographic image 1802 on the display screen 1800 so that the retina portion can be easily observed.

FIG. 18B shows a display screen 1810 when the vitreous portion and the choroid portion are selected as the areas to be noticed. On the display screen 1810, a tomographic image 1812 that has been subjected to gradation conversion processing so that the options 1811 and the regions of the vitreous part and the choroid part can be easily observed are displayed.

When the operator desires the regions of the vitreous part and the choroid part as the regions to be focused on, the operator selects the three regions of the retina, the vitreous / choroid, and the whole in the option 1801 via the input unit 40. From the options, select the vitreous / choroid. The selecting unit 1623 selects the image quality improving process for the regions of the vitreous part and the choroid part according to the instruction from the operator, and the image quality improving unit 1622 applies the image quality improving process selected for the tomographic image, A high-quality tomographic image 1812 that allows easy observation of the vitreous body and choroid is generated. The display control unit 350 displays on the display screen 1810 a tomographic image 1812 in which the generated vitreous body and choroid can be easily observed.

FIG. 18C shows the display screen 1820 when the entire area is selected as the area of interest. On the display screen 1820, a tomographic image 1822 that has been subjected to gradation conversion processing so that the options 1821 and the entire region can be easily observed is displayed.

When the operator desires the entire area as the area to be noticed, the operator selects the entire area from the three options of the retina, vitreous / choroid, and the entire area in the option 1821 via the input unit 40. select. The selecting unit 1623 selects the image quality improving process for the entire image in accordance with the instruction from the operator, and the image quality improving unit 1622 applies the image quality improving process selected for the tomographic image to generate a high image quality tomographic image. To generate. In this case, the image quality improving unit 1622 uses the learned model to generate a high quality tomographic image that makes it easy to observe the entire image. The display controller 350 displays the generated tomographic image 1822 on the display screen 1820 so that the entire region is easy to observe.

As described above, the control unit 1600 according to the present embodiment includes the selection unit 1623 that selects the image processing to be applied to the first tomographic image acquired by the acquisition unit 310 according to the instruction from the operator. Based on the image processing selected by the selecting unit 1623, the image quality improving unit 1622 performs the gradation conversion process on the first tomographic image without using the learned model, and the third tomographic image (third medical image ) Is generated, or a second tomographic image is generated from the first tomographic image using the learned model.

With such a configuration, the control unit 1600 can observe a tomographic image that has undergone different image processing depending on the region that the operator wants to pay attention to. In particular, as described above, in the image quality enhancement process using the learned model, a tissue that does not actually exist may be drawn, or a tissue that originally exists may disappear. Therefore, erroneous diagnosis can be prevented by comparing and observing tomographic images subjected to different image processing.

Further, as described above, the gradation conversion processing for facilitating observation of the retina area and the gradation conversion processing for facilitation of observation of the vitreous and choroidal areas are not premised on the segmentation processing. . Therefore, appropriate image quality improvement processing can be expected even in a diseased eye.

In the present embodiment, an example has been described in which after the operator's instruction regarding the area to be noted is acquired in step S1703, image processing according to the instruction is performed. However, the order of obtaining the instruction from the operator and the image processing is not limited to this. In advance, the image quality improving unit 1622 performs image processing of all options on the original tomographic image to generate high-quality tomographic images for each, and displays the high-quality tomographic image according to the instruction of the operator. It is also possible to switch only the above. In this case, the selection unit 1623 can function as a selection unit that selects a high-quality tomographic image to be displayed.

In addition, preset image processing (default image processing) is applied to the original tomographic image to generate a high-quality tomographic image, and after displaying the high-quality tomographic image, an instruction from the operator is acquired. You may. In this case, if an instruction is received from the operator regarding image processing other than the default image processing, a new high quality image subjected to image processing according to the instruction can be displayed.

Note that the example in which the same image processing is performed in the vitreous region and the choroid region has been described, but different image processing may be performed in the vitreous region and the choroid region.

Also, the image processing is not limited to the image quality improvement processing for the retina area, the image quality improvement processing for the vitreous area and the choroid area, and the image quality improvement processing using the learned model. For example, the gradation conversion processing that facilitates observation of the regions of the retina, vitreous body, and choroid, which is based on the segmentation processing as described above, may be included in the image processing options. In this case, the high-quality tomographic image generated by the image processing based on the segmentation processing and the high-quality tomographic image generated by the image processing using the learned model can be compared and observed. Therefore, the operator can easily determine the false detection due to the segmentation process and the authenticity of the tissue in the tomographic image generated using the learned model.

(Example 3)
In the first embodiment, the image subjected to the high image quality processing is displayed using the learned model. On the other hand, in the OCT apparatus according to the third embodiment, different analysis conditions are applied to each of a plurality of different regions in the generated high-quality tomographic image, image analysis is performed, and the image result is displayed.

The OCT apparatus according to this embodiment will be described below with reference to FIGS. 19 and 20. Since the configuration other than the control unit according to the present embodiment is the same as that of the OCT apparatus 1 according to the first embodiment, the same reference numerals are used and the description thereof is omitted. Hereinafter, the OCT apparatus according to the present embodiment will be described focusing on the differences from the OCT apparatus according to the first embodiment.

FIG. 19 shows a schematic configuration example of the control unit 1900 according to this embodiment. In the control unit 1900, the configuration other than the analysis unit 1924 of the image processing unit 1920 is the same as the configuration of the control unit 30 according to the first embodiment, and thus the same reference numerals are used and the description thereof is omitted.

The image processing unit 1920 is provided with an analysis unit 1924 in addition to the tomographic image generation unit 321 and the image quality improvement unit 322. The analysis unit 1924 performs image analysis on the high-quality tomographic image generated by the high-quality image generation unit 322 based on the analysis condition set for each region. Here, as the analysis condition set for each region, for example, layer extraction or blood vessel extraction is set in the retina region or choroid region, and detection of vitreous or vitreous detachment is set in the vitreous region. To be done. The analysis conditions may be set in advance or may be set appropriately by the operator.

When the layer extraction is set as the analysis condition, the analysis unit 1924 can perform the layer extraction for the region for which the analysis condition is set, and perform the layer thickness value measurement or the like for the extracted layer. Further, when the blood vessel extraction is set as the analysis condition, the analysis unit 1924 can perform the blood vessel extraction on the region for which the analysis condition is set, and can perform the blood vessel density measurement or the like on the extracted blood vessel. . Furthermore, when the detection of the vitreous body or the separation of the vitreous body is set as the analysis condition, the analysis unit 1924 detects the vitreous body or the separation of the vitreous body in the region for which the analysis condition is set. After that, the analysis unit 1924 can quantify the detected vitreous body and peeling of the vitreous body, and can obtain the thickness, width, area, volume, and the like of the vitreous body and peeling of the vitreous body.

Note that the analysis conditions are not limited to these, and may be set arbitrarily according to the desired configuration. For example, detection of the fibrous structure of the vitreous for the region of the vitreous part may be set. In this case, the analysis unit 1924 can quantify the detected fibrous structure of the vitreous and determine the thickness, width, area, volume, etc. of the fibrous structure. Further, the analysis process according to the analysis condition is not limited to the above process and may be arbitrarily set according to a desired configuration.

The display control unit 350 causes the display unit 50 to display the result of the image analysis performed by the analysis unit 1924 together with the high-quality tomographic image or separately from the high-quality tomographic image.

Next, a series of image processing according to this embodiment will be described with reference to FIG. FIG. 20 is a flowchart of a series of image processing according to this embodiment. Note that steps S2001 to S2003 are the same as steps S1101 to S1103 according to the first embodiment, and thus description thereof will be omitted.

In step S2003, when the image quality improving unit 322 generates a high quality tomographic image as in step S1103, the process proceeds to step S2004. In step S2004, the analysis unit 1924 performs segmentation processing on the generated high-quality tomographic image and detects a plurality of different regions in the tomographic image. The analysis unit 1924 can detect, for example, a vitreous region, a retina region, a choroid region, and the like as the plurality of regions. Any known method can be used as the method of the segmentation processing, and for example, the segmentation processing may be a rule-based segmentation processing. Here, the rule-based processing refers to processing using known regularity, for example, known regularity such as regularity of retina shape.

After that, the analysis unit 1924 performs image analysis on each area based on the analysis condition set for each detected area. For example, the analysis unit 1924 performs layer extraction or blood vessel extraction on the region for which the analysis condition is set according to the analysis condition, and calculates the layer thickness or the blood vessel density. The layer extraction and the blood vessel extraction may be performed by any known segmentation process or the like. In addition, the analysis unit 1924 may detect the vitreous body, the vitreous body exfoliation, and the vitreous body fiber structure in accordance with the analysis conditions, and perform quantification thereof. Note that the analysis unit 1924 can perform further contrast enhancement, binarization, morphology processing, boundary line tracking processing, and the like when detecting the vitreous body and the vitreous body exfoliation and the vitreous body fiber structure.

In step S2005, the display control unit 350 uses the analysis results (eg, layer thickness, blood vessel density, vitreous area, etc.) analyzed by the analysis unit 1924 to generate high-quality tomographic images generated by the high-quality image generation unit 322. The image is displayed on the display unit 50 together with the image. The display mode of the analysis result may be any mode according to the desired configuration. For example, the display control unit 350 may display the analysis result of each area in association with each area of the high-quality tomographic image. Further, the display control unit 350 may display the analysis result on the display unit 50 separately from the high quality tomographic image. When the display processing by the display control unit 350 ends, a series of image processing ends.

As described above, the control unit 1900 according to the present embodiment applies different analysis conditions to each of different areas in the high-quality tomographic image (second tomographic image) generated by the image quality improving unit 322, and The analysis part 1924 which performs an analysis is provided. The display control unit 350 causes the display unit 50 to display the analysis result of each of the different regions in the high-quality tomographic image by the analysis unit 1924.

According to this configuration, the analysis unit 1924 performs image analysis on the high-quality tomographic image generated by the high-quality image generation unit 322, so that features and the like in the image are detected more appropriately, and more accurate image analysis is performed. It can be performed. In addition, the analysis unit 1924 performs an image analysis on a high-quality tomographic image obtained by performing appropriate image processing for each region in accordance with the analysis conditions set for each region, thereby obtaining an appropriate analysis result for each region. Can be output. Therefore, the operator can quickly obtain an appropriate analysis result for the eye to be inspected.

In this embodiment, the analysis unit 1924 automatically performs image analysis on high-quality tomographic images according to the analysis conditions for each region. On the other hand, the analysis unit 1924 may start image processing on a high-quality tomographic image in response to an instruction from the operator.

The analysis unit 1924 according to the present embodiment may be applied to the control unit 1600 according to the second embodiment. In this case, the analysis unit 1924 may perform the above-described image analysis on the tomographic images generated in steps S1705 to S1707, or may perform only the image processing on the region to be observed selected in step S1704. Good. Note that, when the segmentation process is performed when the image quality is improved, the analysis unit 1924 can perform the above-described image analysis on the high-quality tomographic image using the result of the segmentation process.

Furthermore, in the present embodiment, the analysis unit 1924 performs segmentation processing on the high-quality tomographic image generated by the image quality enhancement unit 322, and detects different areas. On the other hand, for example, when the analysis unit 1924 is applied to the control unit according to the modified example 3 of the first embodiment, the analysis unit 1924 uses the label image obtained by using the first learned model, A plurality of different areas in a high-quality tomographic image may be grasped.

(Modification 5)
Further, the

image processing units

320, 1620, and 1920 may generate a label image using a learned model for segmentation for the tomographic image and perform the segmentation process. Here, the label image means a label image in which a region label is attached to each pixel in the tomographic image as described above. Specifically, it is an image in which an arbitrary region of the region group drawn in the image is divided by a group of pixel values (hereinafter, label value) that can be specified. Here, the specified arbitrary region includes a region of interest (ROI: Region Of Interest) and a volume of interest (VOI: Volume Of Interest).

By specifying the coordinate group of pixels with an arbitrary label value from the image, you can specify the coordinate group of pixels that depict the corresponding region such as the retina layer in the image. Specifically, for example, when the label value indicating the ganglion cell layer forming the retina is 1, the coordinate group having a pixel value of 1 is specified from the pixel group of the image, and the coordinate group is associated with the image. The pixel group to be extracted is extracted. Thereby, the region of the ganglion cell layer in the image can be specified.

Note that the segmentation processing may include processing for performing reduction or enlargement processing on the label image. At this time, the image complement processing method used for reducing or enlarging the label image uses a nearest neighbor method or the like that does not erroneously generate an undefined label value or a label value that should not exist at the corresponding coordinates. .

Here, the segmentation process will be explained in more detail. The segmentation process is a process of identifying a region called ROI or VOI such as an organ or a lesion depicted in an image for use in image diagnosis or image analysis. For example, according to the segmentation process, the region group of the layer group that configures the retina can be specified from the image acquired by the OCT imaging in which the posterior segment of the eye is the imaging target. Note that the number of specified regions is 0 if the region to be specified in the image is not drawn. Further, as long as a plurality of region groups to be specified in the image are drawn, the number of specified regions may be plural, or may be one region surrounding the region group. Good.

The specified area group is output as information that can be used in other processing. Specifically, for example, the coordinate group of the pixel groups forming each of the specified region groups can be output as a numerical data group. Further, for example, a coordinate group indicating a rectangular area, an elliptical area, a rectangular area, an ellipsoidal area or the like including each of the specified area groups can be output as a numerical data group. Furthermore, for example, a coordinate group indicating a straight line, a curved line, a plane, a curved surface, or the like, which is the boundary of the specified region group, can be output as a numerical data group. Further, for example, a label image showing the specified area group can be output.

Here, as a machine learning model for segmentation, for example, a convolutional neural network (CNN) can be used. As a machine learning model according to this modification, for example, a CNN (U-net type machine learning model) as shown in FIG. 10 or a model combining CNN and LSTM (Long short-term memory) is used. Can be used. Further, FCN (Fully Concurrent Network), SegNet, or the like can be used as the machine learning model. Furthermore, a machine learning model or the like that performs object recognition in area units can be used according to the desired configuration. As a machine learning model for recognizing an object in a region unit, for example, RCNN (Region CNN), fastRCNN, or fastRCNN can be used. Further, YOLO (You Only Look Once), SSD (Single Shot Detector, or Single Shot MultiBox Detector) can be used as a machine learning model for recognizing objects in units of areas. The machine learning model illustrated here may be applied to the first learned model described in the third modification.

Also, the learning data of the machine learning model for segmentation uses a tomographic image as input data, and a label image in which a region label is attached to each pixel of the tomographic image as output data. Label images include, for example, inner limiting membrane (ILM), nerve fiber layer (NFL), ganglion cell layer (GCL), photoreceptor inner segment outer segment junction (ISOS), retinal pigment epithelium layer (RPE), Bruch. Labeled images with labels such as membrane (BM) and choroid can be used. In addition, as other regions, for example, vitreous, sclera, outer plexiform layer (OPL), outer granule layer (ONL), inner plexiform layer (IPL), inner granule layer (INL), cornea, anterior chamber, iris, Alternatively, an image with a label such as a crystalline lens may be used. Note that the label image illustrated here may be used as output data of learning data regarding the first learned model described in Modification 3.

Also, the input data of the machine learning model for segmentation is not limited to the tomographic image. It may be an anterior segment image, an SLO fundus image, a fundus front image obtained by using a fundus camera, or an En-Face image or an OCTA front image described later. In this case, as the learning data, various images can be used as input data, and a label image in which a region name or the like is labeled for each pixel of various images can be used as output data. For example, when the input data of the learning data is a fundus front image, the output data may be an image labeled with a peripheral portion of the optic disc, Disc, and Cup. The input data may be an image with high image quality or an image without high image quality.

The label image used as the output data may be an image in which each region is labeled in a tomographic image by a doctor or the like, or an image in which each region is labeled by a rule-based region detection process. It may be. However, if machine learning is performed using label images that have not been appropriately labeled as output data for training data, images obtained using a trained model trained using the training data will also be labeled appropriately. May result in a label image that has not been processed. Therefore, by removing the pair including such a label image from the learning data, it is possible to reduce the possibility that an inappropriate label image is generated using the learned model. Here, the rule-based area detection process refers to a detection process that uses a known regularity such as the regularity of the shape of the retina.

The

image processing units

320, 1620, and 1920 can be expected to detect a specific area in various images at high speed and with accuracy by performing segmentation processing using such a learned model for segmentation. The learned model for segmentation may be used as the first learned model described in Modification 3. Further, in the third embodiment, the analysis unit 1924 may perform the segmentation process using the learned model according to this modification.

Note that a trained model for segmentation may be prepared for each type of various images that are input data. Furthermore, the learned model for segmentation may be one that has been trained on images for each imaging region (for example, the center of the macula, the center of the optic disc), or it may be learned regardless of the imaging region. Good.

Also, when generating an En-Face image or OCTA front image, the depth range is set and specified as described below. Therefore, for the En-Face image and the OCTA front image, a learned model may be prepared for each depth range for generating the image.

Note that the

image processing units

320, 1620, and 1920 perform rule-based segmentation processing or segmentation processing using a learned model on at least one of the images before and after the image

quality improvement units

322 and 1622 perform the image quality improvement processing. It can be performed. As a result, the image processing unit 320 can identify different regions in the at least one image. In particular, the

image processing units

320, 1620, and 1920 use a learned model for segmentation (third learned model) different from the learned model for generating a high-quality image (second medical image). , Perform segmentation processing. As a result, it can be expected that a different region in at least one of the images can be specified accurately at high speed.

(Modification 6)
The high-quality image obtained by using the learned model by the image

quality improving units

322 and 1622 according to the above-described embodiment and modification may be manually corrected according to the instruction from the operator. For example, the image quality improvement model may be updated by additional learning using, as learning data, a high quality image in which image processing of a designated area is changed, in response to an instruction from the examiner. In this case, for example, in the high-quality image generated by using the high-quality model, the gradation conversion process is performed on the retina in the region where the gradation conversion process is performed on the vitreous part and the choroid part. The image can be used as learning data for additional learning. On the contrary, in the high-quality image generated by using the high-quality image model, an image corrected so that the gradation conversion process is performed on the vitreous part and the choroid part in the region where the gradation conversion process is performed on the retina part. Can be used as learning data for additional learning.

Also, the image quality improvement model may be updated by additional learning using the value of the ratio set (changed) according to the instruction from the examiner as the learning data. For example, if the examiner tends to set a high ratio of the input image to the high-quality image when the input image is relatively dark, the learned model is additionally learned so as to have such a tendency. Thereby, for example, it can be customized as a learned model that can obtain a composition ratio that matches the taste of the examiner.

At this time, a button may be displayed on the display screen for deciding whether or not to use the set (changed) value of the proportion as learning data for additional learning in response to an instruction from the examiner. Accordingly, the

control units

30, 1600, 1900 can determine the necessity of additional learning according to the instruction of the operator. Alternatively, the ratio determined using the learned model may be set as a default value, and then the ratio value may be changed from the default value in response to an instruction from the examiner.

Note that, as described later, the trained model can be provided in a device such as a server. In this case, the

control unit

30, 1600, 1900 sets the input image and the above-described corrected high-quality image as a pair of learning data in accordance with an instruction from the operator to perform additional learning. It can be transmitted and saved in the server or the like. In other words, the

control units

30, 1600, 1900 can determine whether or not to transmit the learning data of the additional learning to the device such as the server including the learned model according to the instruction of the operator.

Note that, with respect to the various learned models described in the above-described embodiments and other modified examples, additional learning may be performed by similarly using the data manually corrected according to the instruction of the operator as the learning data. Further, the determination of the necessity of additional learning and the determination of whether to transmit the data to the server may be performed by the same method. Also in these cases, it can be expected that the accuracy of each processing can be improved and that processing according to the tendency of the examiner's preference can be performed.

For example, for a trained model for segmentation, additional learning may be performed using the data manually corrected according to the operator's instruction as the learning data. Further, the determination as to whether additional learning is necessary or whether to transmit data to the server may be performed by the same method as the above method. Also in these cases, it can be expected that the accuracy of the segmentation process is improved and that the process according to the preference of the examiner can be performed.

(Modification 7)
In each of the above-described embodiments and modifications, the

image processing units

320, 1620, and 1920 can also generate an En-Face image or OCTA front image of the eye to be inspected using the three-dimensional tomographic image. In this case, the display control unit 350 can display the generated En-Face image or OCTA image on the display unit 50. The analysis unit 1924 can also analyze the generated En-Face image and OCTA image.

Here, the En-Face image and the OCTA front image will be described. The En-Face image is a front image generated by projecting data in an arbitrary depth range in a three-dimensional tomographic image obtained by using optical interference in the XY directions. The front image is a depth range of at least a part of volume data (three-dimensional tomographic image) obtained by using optical interference, and data corresponding to the depth range determined based on the two reference planes is a two-dimensional plane. It is generated by projecting on or integrating with.

For example, the En-Face image is generated by projecting onto a two-dimensional plane data corresponding to a depth range determined based on the retinal layer detected by the segmentation processing of the two-dimensional tomographic image in the volume data. You can As a method of projecting the data corresponding to the depth range determined based on the two reference planes onto the two-dimensional plane, for example, the representative value of the data within the depth range is set as the pixel value on the two-dimensional plane. Techniques can be used. Here, the representative value can include a value such as an average value, a median value, or a maximum value of pixel values within a range (depth range) in the depth direction of a region surrounded by two reference planes.

The depth range related to the En-Face image is based on, for example, two layer boundaries regarding the retinal layer detected by the above-described rule-based segmentation processing method or the segmentation processing using the trained model described in Modification 5. May be specified. Further, the depth range may be a range including a predetermined number of pixels in a deeper direction or a shallower direction with reference to one of two layer boundaries regarding the retinal layer detected by these segmentation processes. In addition, the depth range related to the En-Face image may be, for example, a range that is changed (offset) in accordance with an operator's instruction from a range between two layer boundaries related to the detected retinal layer. Good. At this time, the operator moves, for example, an index indicating the upper limit or the lower limit of the depth range, which is superimposed on the tomographic image whose image quality has been improved by the image

quality improving units

322 and 1622 or which has not been imaged. The depth range can be changed by, for example,

Note that the generated front image is not limited to the En-Face image based on the brightness value (En-Face image of brightness) as described above. The generated front image may be, for example, a motion contrast front image generated by projecting or integrating data corresponding to the above-described depth range on a two-dimensional plane for motion contrast data between a plurality of volume data. Here, the motion contrast data is data indicating a change between a plurality of volume data obtained by controlling the measurement light to be scanned a plurality of times in the same region (same position) of the eye to be inspected. At this time, the volume data is composed of a plurality of tomographic images obtained at different positions. Then, the motion contrast data can be obtained as the volume data by obtaining the data indicating the change between the plurality of tomographic images obtained at the substantially same position at each of the different positions. The motion contrast front image is also referred to as an OCTA front image (OCTA En-Face image) regarding OCT angiography (OCTA) for measuring the movement of blood flow, and the motion contrast data is also referred to as OCTA data. The motion contrast data can be obtained, for example, as a decorrelation value between two tomographic images or corresponding interference signals, a variance value, or a value obtained by dividing the maximum value by the minimum value (maximum value / minimum value). , May be obtained by any known method. At this time, the two tomographic images can be obtained, for example, by controlling so that the measurement light is scanned a plurality of times in the same region (same position) of the subject's eye.

The three-dimensional OCTA data (OCT volume data) used when generating the OCTA front image is generated by using at least a part of the interference signal common to the volume data including the tomographic image used for image segmentation. Good. In this case, the volume data (three-dimensional tomographic image) and the three-dimensional OCTA data can correspond to each other. Therefore, by using the three-dimensional motion contrast data corresponding to the volume data, for example, a motion contrast front image corresponding to the depth range determined based on the retinal layer detected by the image segmentation can be generated.

Note that the volume data used when generating the En-Face image or the OCTA front image may be composed of tomographic images of which the image quality is improved by the image

quality improving units

322 and 1622. In other words, the

image processing units

320, 1620, and 1920 may generate an En-Face image or an OCTA front image by using the volume data composed of a plurality of tomographic images obtained at a plurality of different positions with high image quality. Good. In other words, when the images before and after the image quality enhancement processing by the image

quality enhancement units

322 and 1622 are three-dimensional OCT tomographic images, the

image processing units

320, 1620, and 1920 display one of the images after the image quality enhancement process. A front image corresponding to the depth range of the part can be generated. As a result, the

image processing units

320, 1620, and 1920 can generate a high-quality front image based on the high-quality three-dimensional tomographic image.

(Modification 8)
Next, an image processing apparatus according to Modification 8 will be described with reference to FIGS. 21A to 23. In the above-described embodiments and modified examples, the image

quality improving units

322 and 1622 perform the image quality improving process on the tomographic image using the learned model (image quality improving model) for image quality improvement. On the other hand, the image

quality improving units

322 and 1622 may perform the image quality improving process on other images by using the image quality improving model, and the display control unit 350 displays the various image quality improving images on the display unit. It may be displayed on 50. For example, the image

quality enhancement units

322 and 1622 may perform the image quality enhancement processing on the En-Face image of brightness, the OCTA front image, and the like. Further, the display control unit 350 can cause the display unit 50 to display at least one of the tomographic image, the brightness En-Face image, and the OCTA front image, which have been subjected to the image quality enhancement processing by the image

quality enhancement units

322 and 1622. . The image displayed with high image quality may be an SLO fundus image, a fundus image acquired by a fundus camera (not shown), a fluorescent fundus image, or the like.

Here, the learning data of the image quality enhancement model for performing the image quality enhancement process on various images is the same as the learning data of the image quality enhancement model according to the above-described embodiment and modification regarding various images. The previous image is used as input data, and the image after the high image quality processing is used as output data. Note that, regarding the image quality enhancement processing regarding the learning data, similar to the above-described embodiment and modification, for example, arithmetic averaging processing, processing using a smoothing filter, maximum posterior probability estimation processing (MAP estimation processing), floor It may be a tone conversion process or the like. Further, the image after the high image quality processing may be, for example, an image that has been subjected to filter processing such as noise removal and edge enhancement, or an image whose contrast is adjusted from a low-luminance image to a high-luminance image. May be used. Furthermore, since the output data of the teacher data related to the high image quality model may be a high quality image, it was captured using an OCT device having higher performance than the OCT device when the image as the input data was captured. It may be an image or an image taken with a high load setting.

Also, the image quality improvement model may be prepared for each type of image to be subjected to the image quality improvement processing. For example, a high image quality model for a tomographic image, a high image quality model for an En-Face image of brightness, and a high image quality model for an OCTA front image may be prepared. Further, the image quality improvement model for the En-Face image of luminance and the image quality improvement model for the OCTA front image are learning by comprehensively learning images of different depth ranges with respect to the depth range related to image generation (generation range). It may be a completed model. Images of different depth ranges may include, for example, as shown in FIG. 21A, images of the surface layer (Im2110), the deep layer (Im2120), the outer layer (Im2130), and the choroidal vascular network (Im1940). Further, as the image quality improvement model for the brightness En-Face image and the image quality improvement model for the OCTA front image, a plurality of image quality improvement models obtained by learning images for different depth ranges may be prepared. It should be noted that the image quality improvement model for performing the image quality improvement processing on images other than the tomographic image is not limited to the image quality improvement model for performing different image processing for each region, and the image quality improvement model for performing the same image processing on the entire image is performed. It may be a model.

When preparing a high quality image model for a tomographic image, it may be a learned model that comprehensively learns tomographic images obtained at different positions in the sub-scanning direction (Y-axis direction). The tomographic images Im2151 to Im2153 illustrated in FIG. 21B are examples of tomographic images obtained at different positions in the sub-scanning direction. However, in the case of an image taken at a place where the imaged site (for example, the center of the macula, the center of the optic papilla) is taken, learning may be performed separately for each imaged site, and the imaged site does not matter. You may also learn together. It should be noted that the tomographic image of high image quality may include a tomographic image of brightness and a tomographic image of motion contrast data. However, since the image feature amount differs greatly between the tomographic image of luminance and the tomographic image of motion contrast data, learning may be performed separately for each image quality improvement model.

In this modification, an example will be described in which the display control unit 350 displays an image on which the image

quality improving units

322 and 1622 have performed the image quality improving process on the display unit 50. It should be noted that in the present modification, the description will be given with reference to FIGS. 22A and 22B, but the display screen is not limited to this. The image quality improving process (image quality improving process) can be similarly applied to a display screen in which a plurality of images obtained at different dates and times are displayed side by side as in follow-up observation. Further, the image quality enhancement process can be similarly applied to a display screen such as an image capturing confirmation screen where the examiner confirms whether or not the image capturing is successful immediately after the image capturing. The display control unit 350 can cause the display unit 50 to display a plurality of high-quality images generated by the image-

quality enhancing units

322 and 1622 and low-quality images that have not been enhanced in image quality. In addition, the display control unit 350 selects, for the plurality of high-quality images displayed on the display unit 50 and the low-quality images that have not been enhanced in quality, the low-quality image and the high-quality image selected according to the instruction of the examiner. Can be displayed on the display unit 50. Further, the image processing apparatus can also output the low-quality image and the high-quality image selected according to the instruction of the examiner to the outside.

22A and 22B, an example of the display screen 2200 of the interface according to this modification will be shown below. The display screen 2200 shows the entire screen, and the display screen 2200 shows a patient tab 2201, an imaging tab 2202, a report tab 2203, and a setting tab 2204. Further, the diagonal lines in the report tab 2203 represent the active state of the report screen. In this modification, an example of displaying a report screen will be described.

The report screen shown in FIG. 22A shows an SLO fundus image Im2205, OCTA front images Im2207, Im2208, a luminance En-Face image Im2209, tomographic images Im2211, Im2212, and a button 2220. Further, an OCTA front image Im2206 corresponding to the OCTA front image Im2207 is superimposed and displayed on the SLO fundus image Im2205. Furthermore, the

boundary lines

2213 and 2214 of the depth ranges of the OCTA front images Im2207 and Im2208 are superimposed and displayed on the tomographic images Im2211 and Im2212, respectively. The button 2220 is a button for designating execution of the high image quality processing. The button 2220 may be a button for instructing to display a high quality image, as described later.

In this modified example, the image quality improvement process is executed by designating the button 2220 or based on the information stored (stored) in the database. First, an example in which the display of a high-quality image and the display of a low-quality image are switched by designating the button 2220 according to an instruction from the examiner will be described. Note that the target image for the high image quality processing will be described below as an OCTA front image.

Note that the depth range of the OCTA front images Im2207 and Im2208 may be determined using information of the retinal layer detected by the above-described conventional segmentation processing or segmentation processing using a trained model. The depth range may be, for example, a range between two layer boundaries regarding the detected retinal layer, or a predetermined depth direction or a shallower direction based on one of the two layer boundaries regarding the detected retinal layer. The number of pixels may be included in the range. Further, the depth range may be, for example, a range that is changed (offset) in accordance with an operator's instruction from a range between two layer boundaries regarding the detected retinal layer.

When the inspector specifies the report tab 2203 and transitions to the report screen, the display control unit 350 displays the OCTA front images Im2207 and Im2208 of low image quality. After that, when the examiner designates the button 2220, the image

quality improving units

322 and 1622 perform the image quality improving process on the OCTA front images Im2207 and Im2208 displayed on the screen. After the image quality improving process is completed, the display control unit 350 displays the high image quality images generated by the image

quality enhancing units

322 and 1622 on the report screen. Since the OCTA front image Im2206 is a display in which the OCTA front image Im2207 is superimposed and displayed on the SLO fundus image Im2205, the display control unit 350 can also display an image that has been subjected to high image quality processing for the OCTA front image Im2206. it can. Further, the display control unit 350 can change the display of the button 2220 to the active state so that it can be seen that the image quality improving process has been executed.

Here, the execution of the processing in the image

quality improving units

322 and 1622 does not have to be limited to the timing when the examiner specifies the button 2220. Since the types of the OCTA front images Im2207 and Im2208 to be displayed when opening the report screen are known in advance, the image

quality improving units

322 and 1622 perform image quality improvement processing when the displayed screen transitions to the report screen. You may execute. Then, the display control unit 350 may display the high-quality image on the report screen when the button 2220 is pressed. Furthermore, it is not necessary that there be two types of images for which the high image quality processing is performed in response to an instruction from the examiner or when transitioning to the report screen. An image that is likely to be displayed, for example, a plurality of OCTA front images such as a surface layer (Im2110), a deep layer (Im2120), an outer layer (Im2130), and a choroidal vascular network (Im2140) shown in FIG. 21A is processed. It may be performed. In this case, the image subjected to the high image quality processing may be temporarily stored in a memory or a database.

Next, a case where the image quality improvement processing is executed based on the information stored (recorded) in the database will be described. When a state in which the image quality improvement processing is executed is stored in the database, the display quality is controlled by the image

quality improvement units

322 and 1622 when the transition to the report screen is performed. The unit 350 causes the display unit 50 to display by default. Then, the display control unit 350 displays the button 2220 in the active state by default so that the examiner can see that the high-quality image obtained by executing the high-quality processing is displayed. Can be configured. When the examiner wants to display the low-quality image before the high-quality processing, the display control unit 350 causes the display unit 50 to display the low-quality image by canceling the active state by designating the button 2220. You can At this time, if the examiner wants to return the displayed image to the high-quality image, he or she specifies the button 2220 to activate it, and the display control unit 350 causes the display unit 50 to display the high-quality image again. Let

Whether to execute high image quality processing on the database is specified for each layer, such as common to all data stored in the database and for each shooting data (each inspection). For example, when the state in which the image quality enhancement process is performed on the entire database is saved, the state in which the examiner does not perform the image quality enhancement process is saved for individual imaging data (individual inspection). be able to. In this case, it is possible to display the individual imaged data in which the state in which the high image quality processing is not executed is stored in the state in which the high image quality processing is not executed when displaying next time. With such a configuration, if it is not designated whether or not to execute the image quality enhancement process in units of image data (inspection unit), the process can be performed based on the information designated for the entire database. it can. Further, in the case where the image data is designated by the image data unit (inspection unit), the processing can be individually executed based on the information.

Note that a user interface (not shown) (for example, a save button) may be used to save the execution state of the high image quality processing for each image data (each inspection). Also, when transitioning to other imaging data (other examination) or other patient data (for example, changing to a display screen other than the report screen in response to an instruction from the examiner), the display state (for example, button 2220 The state in which the image quality enhancement processing is executed may be stored based on the (state).

In this modification, an OCTA front image Im2207, Im2208 is displayed as the OCTA front image, but the OCTA front image to be displayed can be changed by the examiner's designation. Therefore, the change of the image to be displayed when the execution of the high image quality processing is designated (the button 2220 is in the active state) will be described.

The image to be displayed can be changed using a user interface (not shown) (for example, a combo box). For example, when the examiner changes the type of image from the surface layer to the choroidal vascular network, the image

quality improving units

322 and 1622 perform the image quality improving process on the choroidal vascular network image, and the display control unit 350 sets the high image quality. The high-quality image generated by the image

quality conversion units

322 and 1622 is displayed on the report screen. That is, the display control unit 350 displays the high-quality image in the first depth range in accordance with the instruction from the examiner, and displays the high-quality image in the second depth range that is at least partially different from the first depth range. You may change to the display of an image. At this time, the display control unit 350 changes the first depth range to the second depth range in response to an instruction from the examiner, thereby displaying the high-quality image in the first depth range to the second depth range. You may change to the display of the high quality image of the depth range of. As described above, for the image that is likely to be displayed at the time of transition of the report screen, if the high quality image has already been generated, the display control unit 350 may display the generated high quality image. .

Further, the method of changing the image type is not limited to the above-described one, but the OCTA front image in which different depth ranges are set by changing the reference layer and the offset value is generated, and the image quality improvement processing is performed on the generated OCTA front image. It is also possible to display a high-quality image obtained by executing. In that case, when the reference layer or the offset value is changed, the image

quality improving units

322 and 1622 execute the image quality improving process on an arbitrary OCTA front image, and the display control unit 350 displays the high image quality image. Display on the report screen. The reference layer and the offset value can be changed using a user interface (not shown) (for example, a combo box or a text box). In addition, the depth range (generation range) of the OCTA front image can be changed by dragging (moving the layer boundary) any one of the

boundary lines

2213 and 2214 that are superimposed and displayed on the tomographic images Im2211 and Im2212, respectively. .

When changing the boundary line by dragging, execution instructions for image quality improvement processing are continuously executed. Therefore, the image

quality improving units

322 and 1622 may always process the execution command, or may execute the command after changing the layer boundary by dragging. Alternatively, the execution of the high image quality processing is instructed continuously, but the previous instruction may be canceled and the latest instruction may be executed when the next instruction comes.

Note that the image quality improvement process may take a relatively long time. Therefore, it may take a relatively long time to display a high-quality image regardless of the timing at which the instruction is executed. Therefore, after the depth range for generating the OCTA front image is set in accordance with the instruction from the examiner and before the high-quality image is displayed, the low-quality OCTA corresponding to the set depth range is displayed. The front image (low quality image) may be displayed. That is, when the depth range is set, a low-quality OCTA front image (low-quality image) corresponding to the set depth range is displayed, and when the high image quality processing is completed, the low-quality OCTA front image is displayed. The display may be changed to the display of a high quality image. In addition, information indicating that the image quality enhancement process is being performed may be displayed from the setting of the depth range to the display of the high quality image. Note that these processes are not limited to the configuration applied when it is premised that the image quality enhancement process is already designated (the button 2220 is in the active state). For example, when the execution of the high image quality processing is instructed according to the instruction from the examiner, these processing can be applied until the high quality image is displayed.

In this modification, the OCTA front images Im2207 and Im2108 relating to different layers are displayed as the OCTA front images, and low-quality and high-quality images are switched and displayed, but the displayed image is not limited to this. . For example, a low image quality OCTA front image may be displayed as the OCTA front image Im2207, and a high image quality OCTA front image may be displayed as the OCTA front image Im2208. When the images are switched and displayed, the images are switched at the same place, so that it is easy to compare the changed portions, and when the images are displayed side by side, the images can be displayed simultaneously, so that the entire images are easily compared.

Next, using FIG. 22A and FIG. 22B, the execution of the image quality improvement process in the screen transition will be described. FIG. 22B shows a screen example in which the OCTA front image Im2207 in FIG. 22A is enlarged and displayed. Also in the screen example shown in FIG. 22B, the button 2220 is displayed as in FIG. 22A. The screen transition from the screen of FIG. 22A to the screen of FIG. 22B is transitioned by, for example, double-clicking the OCTA front image Im2207, and transits from the screen of FIG. 22B to the screen of FIG. 22A by the close button 2230. The screen transition is not limited to the method shown here, and a user interface (not shown) may be used.

If execution of high image quality processing is specified during screen transition (button 2220 is active), that state is maintained even during screen transition. That is, when transitioning to the screen of FIG. 22B while the high-quality image is being displayed on the screen of FIG. 22A, the high-quality image is also displayed on the screen of FIG. 22B. Then, the button 2220 is activated. The same applies to the case of transition from the screen of FIG. 22B to the screen of FIG. 22A. In FIG. 22B, the button 2220 can be designated to switch the display to a low quality image.

Regarding the screen transition, not only the screen shown here but also a display screen for follow-up observation, a display screen for panorama, and the like for displaying the same image data, the high-quality image display state is maintained. The transition can be performed as it is. That is, an image corresponding to the state of the button 2220 on the display screen before transition can be displayed on the display screen after transition. For example, if the button 2220 on the display screen before the transition is in the active state, a high quality image is displayed on the display screen after the transition. Further, for example, if the active state of the button 2220 on the display screen before the transition is released, the low image quality image is displayed on the display screen after the transition. Note that when the button 2220 on the follow-up observation display screen is activated, a plurality of images obtained at different dates and times (different examination dates) displayed side by side on the follow-up observation display screen are switched to high-quality images. May be. That is, when the button 2220 on the display screen for follow-up observation is activated, the button 2220 may be collectively reflected on a plurality of images obtained at different dates and times.

Note that Fig. 23 shows an example of a display screen for follow-up observation. When the tab 2301 is selected according to an instruction from the examiner, a display screen for follow-up observation is displayed as shown in FIG. At this time, the depth range of the OCTA front image can be changed by selecting a set desired by the examiner from the default depth range set displayed in the list boxes 2302 and 2303. For example, the surface layer of the retina is selected in the list box 2302, and the deep layer of the retina is selected in the list box 2303. The analysis result of the OCTA front image of the retinal surface layer is displayed in the upper display area, and the analysis result of the OCTA front image of the deep retinal layer is displayed in the lower display area. When the depth range is selected, it is collectively changed to the parallel display of the analysis results of the plurality of OCTA front images of the selected depth range for the plurality of images at different dates and times.

At this time, if the analysis result display is deselected, it may be collectively changed to a parallel display of a plurality of OCTA front images at different dates and times. Then, when the button 2220 is designated in response to the instruction from the examiner, the display of the plurality of OCTA front images is collectively changed to the display of the plurality of high-quality images.

Further, when the display of the analysis result is in the selected state and the button 2220 is designated in response to the instruction from the examiner, the analysis results of the plurality of OCTA front images are displayed and the analysis results of the plurality of high quality images are displayed. Will be changed all at once. Here, the analysis result may be displayed by superimposing the analysis result on the image with arbitrary transparency. At this time, the change from the display of the image to the display of the analysis result may be, for example, a change in a state in which the analysis result is superimposed on the displayed image with an arbitrary transparency. Further, the change from the display of the image to the display of the analysis result may be, for example, a change to the display of an image (for example, a two-dimensional map) obtained by blending the analysis result and the image with arbitrary transparency. Good.

Also, the type of layer boundary and the offset position used to specify the depth range can be collectively changed from the

user interfaces

2305 and 2306. It should be noted that the

user interfaces

2305 and 2306 for changing the type of layer boundary and the offset position are examples, and any other interface may be used. The tomographic image is also displayed together, and the layer boundary data superimposed on the tomographic image is moved in response to an instruction from the examiner, thereby collectively changing the depth range of a plurality of OCTA front images at different dates and times. You may. At this time, if a plurality of tomographic images of different dates and times are displayed side by side and the above-mentioned movement is performed on one tomographic image, the layer boundary data may be similarly moved on another tomographic image.

Also, the presence / absence of the image projection method and the projection artifact suppression processing may be changed by, for example, selecting from a user interface such as a context menu.

Alternatively, the selection button 2307 may be selected to display a selection screen (not shown), and the image selected from the image list displayed on the selection screen may be displayed. The arrow 2304 displayed at the upper part of FIG. 23 is a mark indicating that the examination is currently selected, and the reference examination (Baseline) is the examination selected in the follow-up imaging (see FIG. 23). It is the image on the left side). Of course, a mark indicating the reference inspection may be displayed on the display unit.

If the “Show Difference” check box 2308 is specified, the measurement value distribution (map or sector map) for the reference image is displayed on the reference image. Further, in this case, in a region corresponding to the other inspection date, a difference measurement value map between the measurement value distribution calculated for the reference image and the measurement value distribution calculated for the image displayed in the region. Is displayed. As the measurement result, a trend graph (a graph of measured values for images on each inspection day obtained by measuring change over time) may be displayed on the report screen. That is, time series data (for example, a time series graph) of a plurality of analysis results corresponding to a plurality of images at different dates and times may be displayed. At this time, analysis results related to dates and times other than the dates and times corresponding to the displayed images are also distinguishable from the analysis results corresponding to the displayed images (for example, time series). The color of each point on the graph may differ depending on whether or not an image is displayed). Further, a regression line (curve) of the trend graph or a corresponding mathematical expression may be displayed on the report screen.

In this modification, the OCTA front image has been described, but the image to which the process according to this modification is applied is not limited to this. The image relating to processing such as display, image quality improvement, and image analysis according to the present modification may be an En-Face image of luminance. Further, not only the En-Face image but also a different image such as a tomographic image by B-scan, an SLO fundus image, a fundus image, or a fluorescent fundus image may be used. In that case, the user interface for executing the high image quality processing is to instruct execution of the high image quality processing for a plurality of images of different types, and select an arbitrary image from the plurality of images of different types. There may be an instruction to execute the image quality enhancement process.

For example, when displaying a tomographic image by B-scan with high image quality, the tomographic images Im2211, Im2212 shown in FIG. 22A may be displayed with high image quality. Further, a high-quality tomographic image may be displayed in the region where the OCTA front images Im2207 and Im2208 are displayed. Note that only one tomographic image having a high image quality and displayed may be displayed, or a plurality of tomographic images may be displayed. When a plurality of tomographic images are displayed, the tomographic images acquired at different positions in the sub-scanning direction may be displayed, or, for example, a plurality of tomographic images obtained by cross scanning or the like may be displayed with high image quality. When displaying, images in different scanning directions may be displayed respectively. When displaying a plurality of tomographic images obtained by, for example, a radial scan with high image quality, a plurality of partially selected tomographic images (for example, two tomographic images at positions symmetrical to each other with respect to a reference line). ) May be displayed respectively. Further, a plurality of tomographic images are displayed on a display screen for follow-up observation as shown in FIG. 23, and an instruction for image quality improvement and an analysis result (for example, the thickness of a specific layer, etc.) are obtained by the same method as the above method. ) May be displayed. Further, the image quality improving process may be performed on the tomographic image based on the information stored in the database by the same method as the above method.

Similarly, when the SLO fundus image is displayed with high image quality, for example, the SLO fundus image Im2205 may be displayed with high image quality. Furthermore, when displaying an En-Face image of luminance with high image quality, for example, the En-Face image Im2209 of luminance may be displayed with high image quality. Furthermore, a plurality of SLO fundus images and En-Face images of brightness are displayed on a display screen for follow-up observation as shown in FIG. , Specific layer thickness, etc.) may be displayed. In addition, the image quality enhancement process may be performed on the SLO fundus image or the En-Face image of the brightness based on the information stored in the database by the same method as the above method. It should be noted that the display of the tomographic image, the SLO fundus image, and the luminance En-Face image is merely an example, and these images may be displayed in any manner depending on the desired configuration. Further, at least two or more of the OCTA front image, the tomographic image, the SLO fundus image, and the luminance En-Face image may be displayed with high image quality by a single instruction.

With such a configuration, the display control unit 350 can display the image that has been subjected to the image quality enhancement processing by the image

quality enhancement units

322 and 1622 according to the present modification on the display unit 50. At this time, as described above, when at least one of the plurality of conditions regarding the display of the high-quality image, the display of the analysis result, the depth range of the displayed front image, and the like is selected, the display screen The selected state may be maintained even when is changed.

Further, as described above, when at least one of the plurality of conditions is in the selected state, the selected state of the at least one is maintained even if the other condition is changed to the selected state. May be. For example, when the display of the analysis result is in the selected state, the display control unit 350 raises the display of the analysis result of the low-quality image according to the instruction from the examiner (for example, when the button 2220 is designated). You may change to the display of the analysis result of a quality image. In addition, the display control unit 350 displays the analysis result of the high-quality image in response to an instruction from the examiner (for example, when the designation of the button 2220 is canceled) when the analysis result display is in the selected state. May be changed to the display of the analysis result of the low quality image.

Further, the display control unit 350, in the case where the display of the high quality image is in the non-selected state, responds to the instruction from the examiner (for example, when the designation of the display of the analysis result is canceled), the display control unit 350 displays the low quality image. The display of the analysis result may be changed to the display of the low quality image. Further, when the display of the high quality image is in the non-selected state, the display control unit 350 displays the low quality image according to the instruction from the examiner (for example, when the display of the analysis result is designated). You may change to the display of the analysis result of a low quality image. In addition, the display control unit 350 analyzes the high-quality image in response to an instruction from the examiner (for example, when the display of the analysis result is canceled) when the display of the high-quality image is in the selected state. The display of the result may be changed to the display of a high quality image. In addition, when the display of the high-quality image is in the selected state, the display control unit 350 raises the display of the high-quality image according to the instruction from the examiner (for example, when the display of the analysis result is designated). You may change to the display of the analysis result of a quality image.

Also, consider the case where the display of the high-quality image is in the non-selected state and the display of the analysis result of the first type is in the selected state. In this case, the display control unit 350 displays the analysis result of the first type of the low-quality image in response to the instruction from the examiner (for example, when the display of the analysis result of the second type is designated). The display may be changed to the display of the analysis result of the second type of the low image quality image. Also, consider a case where the display of the high-quality image is in the selected state and the display of the analysis result of the first type is in the selected state. In this case, the display control unit 350 displays the analysis result of the first type of the high-quality image in response to the instruction from the examiner (for example, when the display of the analysis result of the second type is designated). The display may be changed to the display of the analysis result of the second type of high quality image.

Note that, on the display screen for follow-up observation, as described above, these display changes may be collectively reflected in a plurality of images obtained at different dates and times. Here, the analysis result may be displayed by superimposing the analysis result on the image with arbitrary transparency. At this time, the display of the analysis result may be changed, for example, to a state in which the analysis result is superimposed on the displayed image with arbitrary transparency. The change to the display of the analysis result may be, for example, a change to the display of an image (for example, a two-dimensional map) obtained by blending the analysis result and the image with arbitrary transparency.

In this modification, the image

quality improvement units

322 and 1622 use the image quality improvement model to generate a high quality image in which the image quality of the tomographic image is improved. However, the constituent elements that generate a high-quality image using the high-quality image model are not limited to the high-

quality image units

322 and 1622. For example, a second image quality improving unit different from the image

quality improving units

322 and 1622 may be provided, and the second image quality improving unit may generate a high quality image using the image quality improving model. In this case, the second image quality improving unit is not the high quality image subjected to different image processing for each region using the learned model, but the high quality image subjected to the same image processing for the entire image. May be generated. At this time, the output data of the learned model may be an image in which the same image quality improving process is performed on the entire image. The second image quality improving unit and the image quality improving model used by the second image quality improving unit may be configured by a software module executed by a processor such as a CPU, MPU, GPU, or FPGA. It may be configured by a circuit that performs a specific function such as an ASIC.

(Modification 9)
The display control unit 350 can cause the display unit 50 to display an image selected from the high-quality images generated by the high-quality

image generation units

322 and 1622 and the input image, according to an instruction from the examiner. Further, the display control unit 350 may switch the display on the display unit 50 from the captured image (input image) to the high-quality image in response to an instruction from the examiner. That is, the display control unit 350 may change the display of the low image quality image to the display of the high image quality image in response to an instruction from the examiner. Further, the display control unit 350 may change the display of the high quality image to the display of the low quality image in response to an instruction from the examiner.

Further, the image

quality improvement units

322 and 1622 execute the image quality improvement process using the image quality improvement model (input of the image to the image quality improvement model) according to the instruction from the examiner, and the display control unit The display unit 50 may display the generated high-quality image on the display unit 50. On the other hand, when the image capturing device (image capturing unit 20) captures an input image, the image

quality improving units

322 and 1622 automatically generate a high image quality image based on the input image using the image quality enhancing model, The display control unit 350 may display the high-quality image on the display unit 50 in response to an instruction from the examiner.

Note that these processes can be performed for the output of analysis results as well. That is, the display control unit 350 may change the display of the analysis result of the low-quality image to the display of the analysis result of the high-quality image in response to the instruction from the examiner. In addition, the display control unit 350 may change the display of the analysis result of the high-quality image to the display of the analysis result of the low-quality image according to an instruction from the examiner. Furthermore, the display control unit 350 may change the display of the analysis result of the low image quality image to the display of the low image quality image in response to an instruction from the examiner. Further, the display control unit 350 may change the display of the low image quality image to the display of the analysis result of the low image quality image in response to an instruction from the examiner. Further, the display control unit 350 may change the display of the analysis result of the high quality image to the display of the high quality image in accordance with the instruction from the examiner. Further, the display control unit 350 may change the display of the high-quality image to the display of the analysis result of the high-quality image in response to the instruction from the examiner.

Furthermore, the display control unit 350 may change the display of the analysis result of the low image quality image to the display of the analysis result of another type of the low image quality image in response to an instruction from the examiner. In addition, the display control unit 350 may change the display of the analysis result of the high-quality image to the display of the analysis result of another type of the high-quality image according to the instruction from the examiner.

Here, the analysis result of the high quality image may be displayed by superimposing the analysis result of the high quality image on the high quality image with arbitrary transparency. The analysis result of the low image quality image may be displayed by superimposing the analysis result of the low image quality image on the low image quality image with arbitrary transparency. At this time, the display of the analysis result may be changed, for example, to a state in which the analysis result is superimposed on the displayed image with arbitrary transparency. The change to the display of the analysis result may be, for example, a change to the display of an image (for example, a two-dimensional map) obtained by blending the analysis result and the image with arbitrary transparency.

In this modification, the image

quality improvement units

quality image units

quality improving units

Further, in the modified example 8, an image which has been subjected to the image quality enhancement process using the image quality enhancement model is displayed according to the active state of the button 2220 on the display screen. On the other hand, the analysis value using the result of the segmentation processing using the learned model may be displayed according to the active state of the button 2220. In this case, for example, when the button 2220 is in the inactive state (the segmentation process using the learned model is in the non-selected state), the display control unit 350 causes the display unit 50 to display the analysis result using the result of the segmentation process. Display it. On the other hand, when the button 2220 is activated, the display control unit 350 causes the display unit 50 to display the analysis result using the result of the segmentation process using the learned model.

With such a configuration, the analysis result using the result of the segmentation process without using the learned model and the analysis result using the result of the segmentation process using the learned model are switched and displayed according to the active state of the button. To be done. These analysis results are based on the results of the processing by the learned model and the image processing by the rule base, respectively, and thus there may be a difference between the results. Therefore, by switching and displaying these analysis results, the examiner can compare the two and use a more convincing analysis result for diagnosis.

Note that when the segmentation processing is switched, for example, when the displayed image is a tomographic image, the numerical value of the layer thickness analyzed for each layer may be switched and displayed. Further, for example, when a tomographic image that is divided for each layer by color, a hatching pattern, or the like is displayed, the tomographic images in which the shape of the layer is changed may be switched and displayed according to the result of the segmentation processing. Further, when the thickness map is displayed as the analysis result, the thickness map in which the color indicating the thickness is changed according to the result of the segmentation process may be displayed. Further, the button for designating the high image quality processing and the button for designating the segmentation processing using the learned model may be provided separately, or either one may be provided, or both buttons may be provided. It may be provided as one button.

Also, the switching of the segmentation process may be performed based on the information stored (recorded) in the database, similarly to the switching of the image quality enhancement process described above. Regarding the processing at the time of screen transition, the switching of the segmentation processing may be performed in the same manner as the switching of the image quality improvement processing described above.

(Modification 10)
The display control unit 350 in the various embodiments and modifications described above may display the analysis result of the layer thickness of the desired layer, various blood vessel densities, etc. on the report screen of the display screen. Further, optic disc, macula, vascular region, nerve fiber bundle, vitreous region, macula region, choroid region, sclera region, lamina cribrosa region, retinal layer boundary, retinal layer boundary end, photoreceptor cell, blood cell, The value (distribution) of the parameter regarding the site of interest including at least one of a blood vessel wall, a blood vessel inner wall boundary, a blood vessel outer boundary, a ganglion cell, a corneal region, a corner region, and Schlemm's canal may be displayed as an analysis result. At this time, for example, by analyzing a medical image to which various types of artifact reduction processing are applied, it is possible to display an accurate analysis result. Note that the artifacts are, for example, false image areas caused by light absorption by blood vessel areas, projection artifacts, band-like artifacts in the front image generated in the main scanning direction of the measurement light due to the state of the subject's eye (movement, blinking, etc.), and the like. It may be. Further, the artifact may be any artifact region as long as it occurs randomly on the medical image of the predetermined region of the subject every time the image is captured. Further, the display control unit 350 may cause the display unit 50 to display the value (distribution) of the parameter regarding the area including at least one of the various artifacts (impairment area) as described above as the analysis result. In addition, parameter values (distributions) relating to a region including at least one of drusen, new blood vessels, vitiligo (hard vitiligo), and abnormal sites such as pseudo-drussen may be displayed as the analysis result. The image analysis process may be performed by the analysis unit 1924 or may be performed by an analysis unit different from the analysis unit 1924. Further, the image on which the image analysis is performed may be an image with high image quality or an image without high image quality.

Also, the analysis result may be displayed in an analysis map, a sector indicating a statistical value corresponding to each divided area, or the like. As the analysis result, a learned model (analysis result generation engine, a learned model for analysis result generation) obtained by the analysis unit 1924 or another analysis unit learning the analysis result of the medical image as learning data is used. It may be generated by At this time, the learned model uses learning data including a medical image and an analysis result of the medical image, learning data including a medical image and an analysis result of a medical image of a different type from the medical image, and the like. It may be obtained from

Also, the learning data may include the area label image generated by the segmentation process and the analysis result of the medical image using the area label image. In this case, the

image processing units

320, 1620, and 1920 use, for example, the learned model for generating the analysis result to execute the segmentation processing (for example, the detection result of the retinal layer) to obtain a tomographic image. It can function as an example of an analysis result generation unit that generates an analysis result. In other words, the

image processing units

320, 1620, 1920 are different from the learned model for generating the high-quality image (second medical image), and the learned model for generating the analysis result (fourth learned model). Can be used to generate image analysis results for each of the different regions identified by the segmentation process.

Further, the learned model is obtained by learning using learning data including input data in which a plurality of medical images of different types of predetermined regions are set, such as a brightness front image and a motion contrast front image. Good. Here, the luminance front image corresponds to the luminance En-Face image, and the motion contrast front image corresponds to the OCTA En-Face image.

Also, the analysis result obtained by using the high quality image generated by using the learned model for high image quality may be displayed. In this case, the input data included in the learning data may be a high quality image generated by using a learned model for high image quality, or may be a set of a low quality image and a high quality image. Good. Note that the learning data may be an image in which at least a part of the image whose image quality has been improved by using the learned model is manually or automatically corrected.

Further, the learning data is, for example, at least an analysis value obtained by analyzing the analysis area (for example, an average value or a median value), a table including the analysis value, an analysis map, a position of the analysis area such as a sector in the image, and the like. Information including one item may be data obtained by labeling (annotating) the input data as correct answer data (learning with a teacher). Note that the analysis result obtained by using the learned model for generating the analysis result may be displayed in response to the instruction from the operator.

Further, the display control unit 350 in the above-described embodiments and modifications may display various diagnostic results such as glaucoma and age-related macular degeneration on the report screen of the display screen. At this time, for example, an accurate diagnostic result can be displayed by analyzing a medical image to which the above-described various artifact reduction processes are applied. Further, as the diagnosis result, the position of the identified abnormal part or the like may be displayed on the image, or the state of the abnormal part or the like may be displayed by characters or the like. Furthermore, a classification result of abnormal parts (for example, Curtin classification) may be displayed as a diagnosis result. Further, as the classification result, for example, information indicating the probability of each abnormal part (for example, a numerical value indicating a ratio) may be displayed. Further, information necessary for the doctor to confirm the diagnosis may be displayed as the diagnosis result. As the necessary information, for example, advice such as additional photographing can be considered. For example, when an abnormal portion is detected in the blood vessel region in the OCTA image, it may be displayed that additional fluorescence imaging using a contrast agent that allows more detailed blood vessel observation than OCTA is performed.

The diagnosis result is obtained by using the learned model (diagnosis result generation engine, learned model for diagnosis result generation) obtained by the

control unit

30, 1600, 1900 learning the diagnosis result of the medical image as learning data. It may be generated. Further, the learned model is obtained by learning using learning data including a medical image and a diagnosis result of the medical image, learning data including a medical image and a diagnosis result of a medical image of a different type from the medical image, and the like. It may be obtained.

Further, the learning data may include the region label image generated by the segmentation process and the diagnostic result of the medical image using the region label image. In this case, the

image processing units

320, 1620, 1920 use, for example, a learned model for generating a diagnostic result to execute a segmentation process (for example, a detection result of the retinal layer) to obtain a tomographic image. It can function as an example of a diagnostic result generation unit that generates a diagnostic result. In other words, the

image processing units

320, 1620, 1920 are different from the learned model for generating the high-quality image (second medical image), and the learned model for generating the diagnostic result (fifth learned model). Can be used to generate diagnostic results for each of the different regions identified by the segmentation process.

Further, the diagnosis result obtained by using the high quality image generated by using the learned model for high image quality may be displayed. In this case, the input data included in the learning data may be a high quality image generated by using a learned model for high image quality, or may be a set of a low quality image and a high quality image. Good. Note that the learning data may be an image in which at least a part of the image whose image quality has been improved by using the learned model is manually or automatically corrected.

In addition, the learning data includes, for example, the diagnosis name, the type and state (degree) of the lesion (abnormal site), the position of the lesion in the image, the position of the lesion with respect to the region of interest, findings (interpretation findings, etc.), and the basis of the diagnosis name (affirmation). Input information is labeled (annotated) as correct answer data (for supervised learning) that contains at least one of the following: medical support information), grounds for denying a diagnosis name (negative medical support information), etc. It may be data. The diagnosis result obtained by using the learned model for generating the diagnosis result may be displayed in response to the instruction from the examiner.

In addition, the display control unit 350 according to the above-described various embodiments and modified examples uses the report screen of the display screen, the object recognition result (object detection result) and the segmentation such as the above-described attention site, artifact, and abnormal site. You may display the result. At this time, for example, a rectangular frame or the like may be superimposed and displayed around the object on the image. Further, for example, a color or the like may be superimposed and displayed on the object in the image. It should be noted that the object recognition result and the segmentation result are learned models (object recognition engine, object recognition engine, for object recognition) obtained by learning the learning data obtained by labeling (annotation) the medical image with the information indicating the object recognition and the segmentation as correct data. A trained model, a segmentation engine, and a trained model for segmentation) may be used. The analysis result generation and the diagnosis result generation described above may be obtained by using the object recognition result and the segmentation result described above. For example, the analysis result generation and the diagnosis result generation may be performed on the part of interest obtained by the object recognition and the segmentation processing.

When detecting an abnormal part, the

image processing units

320, 1620, and 1920 may use a hostile generation network (GAN: General Adversary Networks) or a variational auto encoder (VAE: Various Auto-Encoder). . For example, a DCGAN (Deep Convolutional GAN) including a generator obtained by learning generation of a tomographic image and a discriminator obtained by learning discrimination between a new tomographic image generated by the generator and a real frontal fundus image. ) Can be used as a machine learning model.

When DCGAN is used, for example, the discriminator encodes the input tomographic image as a latent variable, and the generator generates a new tomographic image based on the latent variable. Then, the difference between the input tomographic image and the generated new tomographic image can be extracted as the abnormal portion. When VAE is used, for example, the input tomographic image is encoded by an encoder to be a latent variable, and the latent variable is decoded by a decoder to generate a new tomographic image. Then, the difference between the input tomographic image and the generated new tomographic image can be extracted as the abnormal portion. Although a tomographic image has been described as an example of the input data, a fundus image or a front image of the anterior eye may be used.

Furthermore, the

image processing units

320, 1620, and 1920 may detect an abnormal portion using a convolutional auto encoder (CAE: Conventional Auto-Encoder). When CAE is used, the same image is learned as input data and output data during learning. Thus, when an image having an abnormal portion is input to the CAE at the time of estimation, an image having no abnormal portion is output according to the learning tendency. Then, the difference between the image input to the CAE and the image output from the CAE can be extracted as the abnormal portion. Also in this case, not only the tomographic image but also the fundus image, the front image of the anterior eye, etc. may be used as the input data.

In these cases, the

image processing units

320, 1620, and 1920 input the medical images obtained by using the adversarial generation network or the auto-encoder for each of the different regions specified by the segmentation processing and the like to the adversarial generation network or the auto-encoder. It is possible to generate information regarding a difference from the obtained medical image as information regarding the abnormal part. As a result, the

image processing units

320, 1620, 1920 can be expected to detect abnormal parts at high speed and with high accuracy. Here, the auto encoder includes VAE, CAE, and the like.

Also, in the diseased eye, the image characteristics differ depending on the type of disease. Therefore, the learned models used in the various embodiments and modifications described above may be generated and prepared for each type of disease or each abnormal site. In this case, for example, the image processing unit 320 can select a learned model to be used for the processing according to the input (instruction) of the type of disease of the eye to be inspected, the abnormal site, or the like from the operator. Note that the learned model prepared for each type of disease or abnormal site is not limited to the learned model used for detection of the retinal layer or generation of the region label image, and for example, an engine for image evaluation or analysis. It may be a trained model used in the engine of the above. At this time, the

image processing units

320, 1620, and 1920 may identify the type of disease or abnormal site of the eye to be inspected from the image using a separately prepared learned model. In this case, the

image processing units

320, 1620, 1920 automatically generate the learned model used for the above-mentioned processing based on the type of disease or the abnormal part identified by using the separately prepared learned model. You can choose. Note that the learned model for identifying the type of disease or abnormal part of the eye to be examined is the input of a tomographic image or fundus image, and the learning data of the type of disease or abnormal part in these images as output data. You may learn using a pair. Here, as the input data of the learning data, a tomographic image, a fundus image or the like may be used alone as the input data, or a combination thereof may be used as the input data.

In addition, particularly, the learned model for generating the diagnostic result may be a learned model obtained by learning with learning data including input data in which a plurality of medical images of different types of predetermined regions of the subject are set. Good. At this time, as the input data included in the learning data, for example, input data in which a motion contrast front image of the fundus and a luminance front image (or luminance tomographic image) are set can be considered. Further, as the input data included in the learning data, for example, input data in which a tomographic image (B scan image) of the fundus and a color fundus image (or a fluorescent fundus image) are set is also considered. Further, the plurality of medical images of different types may be anything acquired by different modalities, different optical systems, different principles, or the like.

Further, the learned model for generating the diagnostic result may be a learned model obtained by learning with learning data including input data in which a plurality of medical images of different parts of the subject are set. At this time, as the input data included in the learning data, for example, input data in which a tomographic image of the fundus (B scan image) and a tomographic image of the anterior segment (B scan image) are set can be considered. In addition, as input data included in the learning data, for example, input data including a set of a three-dimensional OCT image (three-dimensional tomographic image) of the macula of the fundus and a circle scan (or raster scan) tomographic image of the optic disc of the fundus, and the like. Can also be considered.

The input data included in the learning data may be a plurality of medical images of different parts of the subject and different types. At this time, the input data included in the learning data may be, for example, input data in which a tomographic image of the anterior segment and a color fundus image are set. Further, the learned model described above may be a learned model obtained by learning with learning data including input data in which a plurality of medical images of different imaging fields of view of a predetermined region of the subject are set. Further, the input data included in the learning data may be a combination of a plurality of medical images obtained by time-dividing a predetermined region into a plurality of regions, such as a panoramic image. At this time, by using a wide-angle image such as a panoramic image as learning data, there is a possibility that the feature amount of the image can be acquired with high accuracy because the amount of information is larger than that of the narrow-angle image. The result of can be improved. For example, at the time of estimation (at the time of prediction), when abnormal parts are detected at a plurality of positions in the wide-angle image, enlarged images of the abnormal parts can be sequentially displayed. This allows the abnormal parts at a plurality of positions to be efficiently confirmed, so that the convenience of the examiner can be improved, for example. At this time, for example, the examiner may be configured to select each position on the wide-angle image in which the abnormal portion is detected, and the enlarged image of the abnormal portion at the selected position may be displayed. . Further, the input data included in the learning data may be input data in which a plurality of medical images at different dates and times of a predetermined part of the subject are set as a set.

Also, the display screen on which at least one of the above-mentioned analysis result, diagnosis result, object recognition result, and segmentation result is displayed is not limited to the report screen. Such a display screen is, for example, at least one display screen such as a shooting confirmation screen, a display screen for follow-up observation, and a preview screen for various adjustments before shooting (display screen on which various live moving images are displayed). May be displayed in. For example, by displaying the at least one result obtained using the learned model described above on the shooting confirmation screen, the operator can confirm the accurate result even immediately after the shooting. Further, the display change between the low-quality image and the high-quality image described in Modification 9 and the like may be, for example, a change in the display between the analysis result of the low-quality image and the analysis result of the high-quality image.

Here, the various learned models described above can be obtained by machine learning using learning data. Machine learning includes, for example, deep learning including a multi-layer neural network. In addition, for example, a convolutional neural network (CNN) may be used as a machine learning model for at least a part of the multi-layer neural network. Further, a technology related to an auto encoder (self-encoder) may be used for at least a part of the multi-layer neural network. In addition, a technique related to back propagation (error back propagation method) may be used for learning. However, the machine learning is not limited to deep learning, and may be any learning as long as it uses a model capable of extracting (expressing) the feature amount of learning data such as an image by learning. Here, the machine learning model refers to a learning model based on a machine learning algorithm such as deep learning. The learned model is a model that is trained (learned) with appropriate learning data in advance with respect to a machine learning model by an arbitrary machine learning algorithm. However, it is assumed that the learned model does not perform any further learning but can perform additional learning. The learning data is composed of a pair of input data and output data (correct answer data). Here, the learning data may be referred to as teacher data, or the correct answer data may be referred to as teacher data.

Note that the GPU can perform efficient operations by processing more data in parallel. Therefore, when learning is performed a plurality of times using a learning model such as deep learning, it is effective to perform processing with the GPU. Therefore, in the present modification, a GPU is used in addition to the CPU for the processing by the

image processing units

320, 1620, 1920, which is an example of a learning unit (not shown). Specifically, when the learning program including the learning model is executed, the CPU and the GPU cooperate to perform the learning to perform the learning. The processing of the learning unit may be calculated only by the CPU or GPU. Further, the processing unit (estimation unit) that executes the processing using the various learned models described above may use the GPU similarly to the learning unit. Further, the learning unit may include an error detection unit and an update unit (not shown). The error detection unit obtains an error between the correct data and the output data output from the output layer of the neural network according to the input data input to the input layer. The error detection unit may use a loss function to calculate the error between the output data from the neural network and the correct answer data. The updating unit updates the connection weighting coefficient between the nodes of the neural network based on the error obtained by the error detecting unit so that the error becomes small. The updating unit updates the combination weighting coefficient and the like by using the error back propagation method, for example. The error back-propagation method is a method of adjusting the coupling weighting coefficient between the nodes of each neural network so that the above error becomes small.

In addition, as a machine learning model used for high image quality and segmentation, there are a function of an encoder composed of a plurality of layers including a plurality of downsampling layers and a function of a decoder composed of a plurality of layers including a plurality of upsampling layers. A U-net type machine learning model having is applicable. In the U-net type machine learning model, ambiguous position information (spatial information) in a plurality of layers configured as encoders is converted into a same-dimensional layer (layers corresponding to each other) in a plurality of layers configured as decoders. ) Is used (for example, using a skip connection).

Also, as a machine learning model used for image quality enhancement, segmentation, etc., for example, FCN (Fully Concurrent Network), SegNet, or the like can be used. Alternatively, a machine learning model that performs object recognition in units of regions may be used according to a desired configuration. As the machine learning model for object recognition, for example, RCNN (Region CNN), fastRCNN, or fastRCNN can be used. Further, YOLO (You Only Look Once), SSD (Single Shot Detector, or Single Shot MultiBox Detector) can be used as a machine learning model for recognizing objects in units of areas.

The machine learning model may be, for example, a capsule network (CapsNet). Here, in a general neural network, each unit (each neuron) is configured to output a scalar value, so that, for example, spatial information regarding a spatial positional relationship (relative position) between features in an image is obtained. It is configured to be reduced. As a result, for example, it is possible to perform learning such that the effects of local distortion and parallel movement of the image are reduced. On the other hand, in the capsule network, each unit (each capsule) is configured to output the spatial information as a vector, and thus is configured to hold the spatial information, for example. Thereby, for example, learning can be performed in consideration of the spatial positional relationship between the features in the image.

Further, the high image quality model (learned model for high image quality) may be a learned model obtained by additionally learning the learning data including at least one high image quality image generated by the high image quality model. Good. At this time, whether or not to use the high-quality image as learning data for additional learning may be configured to be selectable by an instruction from the examiner. Note that these configurations are applicable not only to the learned model for improving image quality, but also to the various learned models described above. In addition, a learned model for generating correct answer data for generating correct answer data such as labeling (annotation) may be used for generating correct answer data used for learning various learned models described above. At this time, the learned model for generating correct answer data may be obtained by performing additional learning (sequentially) on correct answer data obtained by labeling (annotating) the examiner. That is, the learned model for generating correct answer data may be obtained by additionally learning the learning data in which the data before labeling is the input data and the data after the labeling is the output data. Further, in a plurality of consecutive frames such as a moving image, it is configured to correct the result of a frame determined to have low accuracy in consideration of the results of object recognition and segmentation of preceding and following frames. Good. At this time, the corrected result may be additionally learned as correct answer data in response to an instruction from the examiner.

In the various embodiments and modifications described above, when detecting a region of the eye to be examined using a learned model for object recognition or a learned model for segmentation, a predetermined image processing is performed for each detected region. Can also be applied. For example, consider the case of detecting at least two regions of the vitreous region, the retina region, and the choroid region. In this case, when image processing such as contrast adjustment is performed on at least two detected areas, different image processing parameters are used to perform adjustment suitable for each area. By displaying the image adjusted for each area, the operator can more appropriately diagnose the disease or the like in each area. The configuration using different image processing parameters for each detected region may be similarly applied to the region of the eye to be detected detected without using the learned model, for example.

(Modification 11)
In the preview screens in the various embodiments and modifications described above, the learned model for improving image quality described above may be used for each at least one frame of the live moving image. At this time, when a plurality of live moving images of different parts or different types are displayed on the preview screen, the learned model corresponding to each live moving image may be used. Accordingly, for example, even in the case of a live moving image, the processing time can be shortened, so that the examiner can obtain highly accurate information before the start of imaging. Therefore, for example, failure in re-imaging can be reduced, so that the accuracy and efficiency of diagnosis can be improved.

The plurality of live moving images may be, for example, a moving image of the anterior segment for alignment in the XYZ directions, and a front moving image of the fundus for focus adjustment and OCT focus adjustment of the fundus observation optical system. Further, the plurality of live moving images may be, for example, a tomographic moving image of the fundus for adjusting the coherence gate of OCT (adjusting the optical path length difference between the measurement optical path length and the reference optical path length). At this time, the various adjustments described above may be performed so that the region detected using the learned model for object recognition or the learned model for segmentation described above satisfies a predetermined condition. For example, a value (for example, a contrast value or an intensity value) relating to a vitreous region or a predetermined retinal layer such as RPE detected using a learned model for object recognition or a learned model for segmentation exceeds a threshold value ( Alternatively, various adjustments such as OCT focus adjustment may be performed so that the peak value is reached. In addition, for example, the OCT of OCT is performed so that a predetermined retinal layer such as a vitreous region or RPE detected using a learned model for object recognition or a learned model for segmentation is at a predetermined position in the depth direction. The coherence gate adjustment may be performed.

In these cases, the image

quality improving units

322 and 1622 can perform the image quality improving process on the moving image by using the learned model to generate the high image quality moving image. Further, the drive control unit 330, in a state in which a high-quality moving image is displayed, sets the shooting range of the reference mirror 221 or the like so that any one of the different regions specified by the segmentation process or the like is at a predetermined position in the display region. It is possible to drive and control the optical member that changes In such a case, the

control unit

30, 1600, 1900 can automatically perform the alignment process based on the highly accurate information so that the desired region is located at a predetermined position in the display region. The optical member for changing the shooting range may be, for example, an optical member for adjusting the coherence gate position, and specifically, may be the reference mirror 221 or the like. Further, the coherence gate position can be adjusted by an optical member that changes the optical path length difference between the measurement optical path length and the reference optical path length, and the optical member is, for example, for changing the optical path length of the measurement light (not shown). It may be a mirror or the like. The optical member that changes the shooting range may be the stage unit 25, for example.

The moving image to which the learned model described above can be applied is not limited to the live moving image, but may be, for example, a moving image stored (saved) in the storage unit. At this time, for example, a moving image obtained by aligning at least one frame of the fundus tomographic moving image stored (saved) in the storage unit may be displayed on the display screen. For example, in order to preferably observe the vitreous region, first, a reference frame may be selected based on the condition that the vitreous region exists on the frame as much as possible. At this time, each frame is a tomographic image (B scan image) in the XZ direction. Then, a moving image in which another frame is aligned in the XZ direction with respect to the selected reference frame may be displayed on the display screen. At this time, for example, the high-quality images (high-quality frames) sequentially generated by using the learned model for high image quality may be continuously displayed for each at least one frame of the moving image.

Note that as the above-described method of alignment between frames, the same method may be applied to the method of alignment in the X direction and the method of alignment in the Z direction (depth direction), or different methods may be used. May be applied. Further, the alignment in the same direction may be performed a plurality of times by different methods, for example, the precise alignment may be performed after performing the rough alignment. Further, as a method of alignment, for example, there is alignment (coarse in the Z direction) using a retinal layer boundary obtained by segmenting a tomographic image (B scan image). Further, as an alignment method, for example, there is an alignment (precision in the X direction and Z direction) using correlation information (similarity) between a plurality of regions obtained by dividing the tomographic image and the reference image. . Furthermore, as the alignment method, for example, alignment (in the X direction) using a one-dimensional projection image generated for each tomographic image (B scan image) and use of a two-dimensional front image (in the X direction) are used. There is alignment etc. Further, it may be configured such that rough alignment is performed in pixel units and then precise alignment is performed in subpixel units.

Here, during various adjustments, the subject such as the retina of the eye to be inspected may not be able to image well yet. Therefore, since there is a large difference between the medical image input to the learned model and the medical image used as the learning data, a high-quality image may not be obtained accurately. Therefore, when the evaluation value such as the image quality evaluation of the tomographic image (B scan) exceeds the threshold value, the display of the high-quality moving image (continuous display of high-quality frames) may be automatically started. Further, when the evaluation value such as the image quality evaluation of the tomographic image (B scan) exceeds the threshold value, the image quality improving button may be changed to a state (active state) that can be designated by the examiner.

Further, it is configured such that a learned model for high image quality that is different for each shooting mode having a different scanning pattern or the like is prepared and a learned model for image quality improvement corresponding to the selected shooting mode is selected. Good. Further, one learned model for image quality improvement obtained by learning the learning data including various medical images obtained in different photographing modes may be used.

(Modification 12)
In the various embodiments and modifications described above, when various learned models are undergoing additional learning, it may be difficult to output (infer / predict) using the learned models themselves during additional learning. Therefore, it is preferable to prohibit the input of the medical image to the learned model during the additional learning. Further, the same learned model as the learned model in the additional learning may be prepared as another preliminary learned model. At this time, it is preferable that the medical image can be input to the preliminary learned model during the additional learning. After the additional learning is completed, the learned model after the additional learning is evaluated, and if there is no problem, the preliminary learned model may be replaced with the learned model after the additional learning. Further, if there is a problem, a preliminary learned model may be used.

Also, the learned model obtained by learning for each imaging region may be selectively used. Specifically, a learning including a first learned model obtained using learning data including a first imaged region (lung, eye to be examined, etc.) and a second imaged region different from the first imaged region A plurality of trained models including a second trained model obtained using the data can be prepared. Then, the

image processing units

320, 1620, and 1920 may have a selection unit that selects one of these plurality of learned models. At this time, the

image processing units

320, 1620, and 1920 may include a control unit that executes additional learning on the selected learned model. In response to an instruction from the examiner, the control means searches for data in which the imaged region corresponding to the selected learned model and the imaged image of the imaged region are paired, and the data obtained by the search is used as learning data. The learning can be performed as additional learning on the selected trained model. The imaged region corresponding to the selected learned model may be acquired from the information in the header of the data or manually input by the examiner. Further, the data search may be performed, for example, from a server or the like of an external facility such as a hospital or a laboratory via a network. This makes it possible to efficiently perform additional learning for each imaged region using the imaged image of the imaged region corresponding to the learned model.

Note that the selection unit and the control unit may be configured by software modules executed by a processor such as the CPU or MPU of the

control unit

30, 1600, 1900. Further, the selection means and the control means may be configured by a circuit such as an ASIC that performs a specific function, an independent device, or the like.

In addition, when acquiring the learning data for additional learning from the server etc. of external facilities such as hospitals and research institutes via the network, it is necessary to reduce tampering and decrease in reliability due to system troubles during additional learning. Is useful. Therefore, the validity of the learning data for additional learning may be detected by confirming the matching by a digital signature or hashing. Thereby, the learning data for additional learning can be protected. At this time, if the legitimacy of the learning data for additional learning cannot be detected as a result of checking the consistency by digital signature or hashing, a warning to that effect is given, and additional learning by the learning data is performed. Make it not exist. The server may be in any form such as a cloud server, a fog server, an edge server, or the like, regardless of its installation location.

(Modification 13)
In the above-described various embodiments and modifications, the instruction from the examiner may be an instruction by voice or the like as well as a manual instruction (for example, an instruction using a user interface or the like). At this time, for example, a machine learning model including a voice recognition model obtained by machine learning (a voice recognition engine, a learned model for voice recognition) may be used. Further, the manual instruction may be an instruction by character input using a keyboard, a touch panel, or the like. At this time, for example, a machine learning model including a character recognition model (character recognition engine, learned model for character recognition) obtained by machine learning may be used. Further, the instruction from the examiner may be an instruction such as a gesture. At this time, a machine learning model including a gesture recognition model (gesture recognition engine, learned model for gesture recognition) obtained by machine learning may be used.

Further, the instruction from the examiner may be a result of detecting the line of sight of the examiner on the display screen of the display unit 50. The line-of-sight detection result may be, for example, a pupil detection result using a moving image of the examiner obtained by photographing the periphery of the display screen of the display unit 50. At this time, the above-described object recognition engine may be used to detect the pupil from the moving image. Further, the instruction from the examiner may be an instruction by an electroencephalogram, a weak electric signal flowing through the body, or the like.

In such a case, for example, as the learning data, character data or voice data (waveform data) indicating the instruction to display the result of the processing of the various learned models as described above is used as the input data, and various learned data is obtained. It may be learning data in which the correct instruction data is an execution command for actually displaying the result of the model processing on the display unit. As the learning data, for example, character data or voice data indicating a display instruction of a high-quality image obtained by a learned model for high image quality is used as input data, and a high-quality image display execution command and a diagram are displayed. 22A and 22B may be learning data in which an execution command for changing a button 2220 to an active state is correct data. The learning data may be anything as long as the instruction content indicated by the character data or the voice data and the execution instruction content correspond to each other. In addition, voice data may be converted into character data by using an acoustic model or a language model. Further, the waveform data obtained by a plurality of microphones may be used to perform the process of reducing the noise data superimposed on the voice data. Further, it may be configured such that an instruction by a character or a voice or an instruction by a mouse or a touch panel can be selected according to an instruction from an examiner. Further, on / off of the instruction by characters or voice may be configured to be selectable according to the instruction from the examiner.

Here, the machine learning includes deep learning as described above, and a recursive neural network (RNN) can be used as at least a part of the multi-layered neural network, for example. Here, as an example of a machine learning model according to the present modification, an RNN that is a neural network that handles time series information will be described with reference to FIGS. 24A and 24B. Further, a long short-term memory (hereinafter, LSTM), which is a type of RNN, will be described with reference to FIGS. 25A and 25B.

FIG. 24A shows the structure of RNN which is a machine learning model. The RNN 2420 has a loop structure in the network, receives the data x ^t 2410 at time t, and outputs the data h ^t 2430. Since the RNN 2420 has a loop function in the network, it is possible to take over the state at the current time to the next state, so that time series information can be handled. FIG. 24B shows an example of input / output of the parameter vector at time t. The data x ^t 2410 includes N pieces of data (Params1 to ParamsN). Further, the data h ^t 2430 output from the RNN 2420 includes N (Params 1 to ParamsN) data corresponding to the input data.

However, since the RNN cannot handle long-term information at the time of error back propagation, the LSTM may be used. The LSTM can learn long-term information by including a forgetting gate, an input gate, and an output gate. Here, the structure of the LSTM is shown in FIG. 25A. In the LSTM2540, the information the network takes over at the next time t is the internal state c ^{t-1 of the} network called a cell and the output data h ^t-1 . The lower case letters (c, h, x) in the figure represent vectors.

Next, FIG. 25B shows details of the LSTM2540. In FIG. 25B, a forgetting gate network FG, an input gate network IG, and an output gate network OG are shown, each being a sigmoid layer. Therefore, a vector in which each element has a value of 0 to 1 is output. The forgetting gate network FG determines how much past information is retained, and the input gate network IG determines which value is updated. Further, in FIG. 25B, a cell update candidate network CU is shown, and the cell update candidate network CU is an activation function tanh layer. This creates a new vector of candidate values that will be added to the cell. The output gate network OG selects a cell candidate element and selects how much information is transmitted at the next time.

Note that the above-mentioned LSTM model is a basic form, so it is not limited to the network shown here. The connection between networks may be changed. QRNN (Quasi Current Neural Network) may be used instead of LSTM. Furthermore, the machine learning model is not limited to the neural network, and boosting, support vector machine, or the like may be used. Further, when the instruction from the inspector is input by characters or voice, a technology related to natural language processing (for example, Sequence to Sequence) may be applied. Further, a dialogue engine (a dialogue model, a learned model for dialogue) that responds to the examiner with an output such as text or voice may be applied.

(Modification 14)
In the various embodiments and modifications described above, the high-quality image, the label image, and the like may be stored in the storage unit according to an instruction from the operator. At this time, for example, after the instruction from the operator to save the high-quality image, when registering the file name, as a recommended file name, any part of the file name (for example, the first part, or At the end), the file name including the information (for example, characters) indicating that the image is generated by the process using the learned model for image quality improvement (image quality improvement process) It may be displayed in an editable state in response to an instruction. Similarly, for the boundary image, the region label image, and the like, a file name including information that is an image generated by the process using the learned model may be displayed.

Further, when displaying a high-quality image on the display unit 50 on various display screens such as a report screen, the displayed image is a high-quality image generated by a process using a learned model for high image quality. May be displayed together with the high quality image. In this case, the operator can easily identify by the display that the displayed high-quality image is not the image itself obtained by photographing, and thus the false diagnosis can be reduced or the diagnostic efficiency can be improved. be able to. It should be noted that the display indicating that the image is a high-quality image generated by the process using the learned model for high image quality is a display that can distinguish the input image from the high-quality image generated by the process. Any form may be used. Further, not only the process using the learned model for image quality improvement, but also the process using the various learned models as described above is the result generated by the process using the learned model of the type. An indication that there is may be displayed with the result. Also, when displaying the analysis result of the segmentation result using the learned model for segmentation processing, a display indicating that it is an analysis result based on the result using the learned model for segmentation is displayed together with the analysis result. It may be displayed.

At this time, the display screen such as the report screen may be saved in the storage unit as image data according to an instruction from the operator. For example, the report screen may be stored in the storage unit as one image in which high-quality images and the like and a display indicating that these images are images generated by the process using the learned model are lined up.

In addition, regarding the display indicating that the image is a high-quality image generated by the process using the learned model for high image quality, the learned model for high image quality learned by what learning data. A display indicating whether there is may be displayed on the display unit. The display may include a description of the types of the input data and the correct answer data of the learning data, and an arbitrary display regarding the correct answer data such as the imaging region included in the input data and the correct answer data. In addition, for the processing using the various learned models described above such as the segmentation processing, a display indicating what kind of learning data the learned model of the type learned is displayed on the display unit. May be.

Also, information (for example, characters) indicating that the image is generated by the process using the learned model may be displayed or saved in a state of being superimposed on the image or the like. At this time, the portion to be superimposed on the image may be any portion as long as it is an area (for example, the edge of the image) that does not overlap with the area in which the attention site or the like to be imaged is displayed. Alternatively, a non-overlapping area may be determined and superimposed on the determined area. Note that not only the processing using the learned model for image quality improvement, but also the image obtained by the processing using the above-described various learned models such as segmentation processing may be similarly processed.

If the button 2220 as shown in FIGS. 22A and 22B is set as the default display screen of the report screen so that the button 2220 is in an active state (image quality improvement processing is turned on), an instruction from the examiner is given. According to the above, the report image corresponding to the report screen including the high quality image may be transmitted to the server. When the button 2220 is set to the active state by default, at the end of the examination (for example, when the photographing confirmation screen or the preview screen is changed to the report screen in response to an instruction from the examiner) In addition, the report image corresponding to the report screen including the high-quality image may be (automatically) transmitted to the server. At this time, various settings in the default settings (for example, a depth range for generating an En-Face image on the initial display screen of the report screen, presence / absence of analysis map superimposition, whether or not a high-quality image is displayed, a display screen for follow-up observation) The report image may be configured to be transmitted to the server based on at least one setting such as whether or not. Note that the same processing may be performed when the button 2220 represents switching of segmentation processing.

(Modification 15)
In the various examples and modifications described above, among various learned models as described above, images obtained by the first type of learned models (for example, analysis results of high-quality images, analysis maps, etc. are shown. The image, the image showing the object recognition result, and the image showing the segmentation result) may be input to the trained model of the second type different from the first type. At this time, a result (for example, an analysis result, a diagnosis result, an object recognition result, a segmentation result) by processing the second type of learned model may be generated.

In addition, of the various learned models as described above, the first type using the result (for example, the analysis result, the diagnosis result, the object recognition result, the segmentation result) of the processing of the first type of the learned model. The image input to the learned model of the first type different from the first type may be generated from the image input to the learned model of. At this time, the generated image is highly likely to be an image suitable as an image to be processed using the second type of learned model. Therefore, an image obtained by inputting the generated image to the second type of learned model (for example, a high-quality image, an image showing the analysis result of an analysis map, an image showing the object recognition result, a segmentation result The accuracy of the image shown) can be improved.

Alternatively, a similar case image search may be performed using an external database stored in a server or the like, using the analysis result or diagnosis result of the processing of the learned model as described above as a search key. In addition, when a plurality of images stored in the database are already managed by the machine learning or the like with the feature amount of each of the plurality of images as additional information, the image itself is used as a search key. A similar case image search engine (similar case image search model, learned model for similar case image search) may be used. For example, the

image processing units

320, 1620, and 1920 are different from the learned model for generating the high-quality image (second medical image), which is a learned model for similar case image retrieval (sixth learned model). Using, the similar case image can be searched for each of the different regions identified by the segmentation process or the like.

(Modification 16)
It should be noted that the process of generating motion contrast data in the above-described embodiment and modification is not limited to the configuration performed based on the brightness value of the tomographic image. The above-described various processes are performed on the tomographic data including the interference signal acquired by the imaging unit 20, the signal obtained by performing the Fourier transform on the interference signal, the signal obtained by subjecting the signal to an arbitrary process, and the tomographic image based on these signals. May be applied. Also in these cases, the same effect as that of the above configuration can be obtained.

Further, the image processing such as the gradation conversion processing in the above-described embodiments and modifications is not limited to the configuration performed based on the brightness value of the tomographic image. The various processes described above may be applied to tomographic data including an interference signal acquired by the imaging unit 20, a signal obtained by subjecting the interference signal to Fourier transform, a signal obtained by subjecting the signal to an arbitrary process, and the like. Also in these cases, the same effect as that of the above configuration can be obtained.

Furthermore, although the spectral domain OCT (SD-OCT) device using the SLD as the light source is described as the OCT device in the above-described embodiments and modifications, the configuration of the OCT device according to the present invention is not limited to this. For example, the present invention can be applied to any other type of OCT device such as a wavelength swept OCT (SS-OCT) device using a wavelength swept light source capable of sweeping the wavelength of emitted light. The present invention can also be applied to a Line-OCT device (or an SS-Line-OCT device) using line light. Further, the present invention can also be applied to a Full Field-OCT device (or SS-Full Field-OCT device) using area light.

In the above embodiments and modifications, an optical fiber optical system using a coupler is used as the splitting means, but a spatial optical system using a collimator and a beam splitter may be used. Further, the configuration of the image capturing unit 20 is not limited to the above configuration, and a part of the configuration included in the image capturing unit 20 may be a configuration separate from the image capturing unit 20.

In addition, in the above-described embodiment and modification, the acquisition unit 310 acquires the interference signal acquired by the imaging unit 20, the tomographic image generated by the image processing unit 320, and the like. However, the configuration in which the acquisition unit 310 acquires these signals and images is not limited to this. For example, the acquisition unit 310 may acquire these signals from a server or a photographing device that is connected to the

control units

30, 1600, 1900 via LAN, WAN, the Internet, or the like.

Further, the learning data of various learned models is not limited to the data obtained by using the ophthalmologic apparatus itself that actually performs imaging, depending on the desired configuration, the data obtained by using the same type of ophthalmologic apparatus, or the same type. It may be data obtained by using an ophthalmologic apparatus.

Note that the various learned models according to the above-described embodiments and modified examples can be provided in the

control units

30, 1600, 1900. The learned model may be composed of, for example, a CPU, a software module executed by a processor such as MPU, GPU, FPGA, or the like, or a circuit that performs a specific function such as ASIC. Further, these learned models may be provided in a device of another server connected to the

control units

30, 1600, 1900. In this case, the

control units

30, 1600, 1900 can use the learned model by connecting to a server or the like having the learned model via an arbitrary network such as the Internet. Here, the server including the learned model may be, for example, a cloud server, a fog server, an edge server, or the like.

In addition, although the tomographic image of the fundus of the eye to be inspected has been described in the above-described embodiments and modifications, the image processing may be performed on the tomographic image of the anterior segment of the eye to be inspected. In this case, the regions to be subjected to different image processing in the tomographic image include regions such as the crystalline lens, cornea, iris, and anterior chamber of the eye. Note that the region may include another region of the anterior segment. Further, the region of the tomographic image regarding the fundus portion is not limited to the vitreous portion, the retina portion, and the choroid portion, and may include other regions regarding the fundus portion. Here, since the tomographic image regarding the fundus portion has a wider gradation than the tomographic image regarding the anterior segment, the image quality can be more effectively improved by the image processing according to the above-described embodiments and modifications. .

Also, in the above-described embodiments and modifications, the subject's eye was described as an example, but the subject is not limited to this. For example, the subject may be skin or another organ. In this case, the OCT apparatus according to the above-described embodiments and modifications can be applied to medical equipment such as an endoscope in addition to the ophthalmologic apparatus.

(Modification 17)
Further, the image processed by the image processing device or the image processing method according to the above-described various embodiments and modifications includes a medical image acquired by using an arbitrary modality (imaging device, imaging method). The medical image to be processed can include a medical image acquired by an arbitrary imaging device or the like, or an image created by the image processing device or the image processing method according to the above-described embodiments and modifications.

Furthermore, the medical image to be processed is an image of a predetermined part of the subject (subject), and the image of the predetermined part includes at least a part of the predetermined part of the subject. In addition, the medical image may include other parts of the subject. Further, the medical image may be a still image or a moving image, and may be a monochrome image or a color image. Further, the medical image may be an image showing the structure (morphology) of a predetermined part or an image showing its function. The image representing the function includes images representing blood flow dynamics (blood flow rate, blood flow velocity, etc.) such as an OCTA image, a Doppler OCT image, an fMRI image, and an ultrasonic Doppler image. The predetermined part of the subject may be determined according to the imaging target, human eyes (inspection eye), brain, lungs, intestines, heart, pancreas, kidneys, organs such as liver, head, chest, It includes any part such as legs and arms.

Also, the medical image may be a tomographic image of the subject or a front image. The front image is, for example, a front image of the fundus of the eye, a front image of the anterior segment of the eye, a fundus image obtained by fluorescence imaging, or at least a part of a range (three-dimensional OCT data) acquired by OCT in the depth direction of the imaging target. The En-Face image generated using the data of 1. is included. The En-Face image is an OCTA En-Face image (motion contrast front image) generated by using at least a partial range of data in the depth direction of the imaging target for the three-dimensional OCTA data (three-dimensional motion contrast data). ) Is okay. The three-dimensional OCT data and the three-dimensional motion contrast data are examples of the three-dimensional medical image data.

Here, the motion contrast data is data indicating a change between a plurality of volume data obtained by controlling the measurement light to be scanned a plurality of times in the same region (same position) of the eye to be inspected. At this time, the volume data is composed of a plurality of tomographic images obtained at different positions. Then, the motion contrast data can be obtained as the volume data by obtaining the data indicating the change between the plurality of tomographic images obtained at the substantially same position at each of the different positions. The motion contrast front image is also referred to as an OCTA front image (OCTA En-Face image) regarding OCT angiography (OCTA) for measuring the movement of blood flow, and the motion contrast data is also referred to as OCTA data. The motion contrast data can be obtained, for example, as a decorrelation value between two tomographic images or corresponding interference signals, a variance value, or a value obtained by dividing the maximum value by the minimum value (maximum value / minimum value). , May be obtained by any known method. At this time, the two tomographic images can be obtained, for example, by controlling so that the measurement light is scanned a plurality of times in the same region (same position) of the subject's eye.

The En-Face image is, for example, a front image generated by projecting data in the range between two layer boundaries in the XY directions. At this time, the front image is a depth range of at least a part of the volume data (three-dimensional tomographic image) obtained by using optical interference, and corresponds to the depth range determined based on the two reference planes. Is projected or integrated on a two-dimensional plane to be generated. The En-Face image is a front image generated by projecting, on a two-dimensional plane, data corresponding to the depth range determined based on the detected retinal layer in the volume data. As a method of projecting the data corresponding to the depth range determined based on the two reference planes onto the two-dimensional plane, for example, the representative value of the data within the depth range is set as the pixel value on the two-dimensional plane. Techniques can be used. Here, the representative value can include a value such as an average value, a median value, or a maximum value of the pixel values within the range in the depth direction of the area surrounded by the two reference planes. Further, the depth range related to the En-Face image is a range including a predetermined number of pixels in a deeper direction or a shallower direction with reference to one of the two layer boundaries regarding the detected retinal layer, for example. Good. In addition, the depth range related to the En-Face image may be, for example, a range that is changed (offset) in accordance with an operator's instruction from a range between two layer boundaries related to the detected retinal layer. Good.

Also, the image capturing device is a device for capturing an image used for diagnosis. The imaging device detects, for example, a device that obtains an image of a predetermined region by irradiating a predetermined region of a subject with light, radiation such as X-rays, electromagnetic waves, or ultrasonic waves, or radiation emitted from a subject. It includes a device for obtaining an image of a predetermined part by doing so. More specifically, the imaging apparatus according to the various embodiments and modifications described above includes at least an X-ray imaging apparatus, a CT apparatus, an MRI apparatus, a PET apparatus, a SPECT apparatus, an SLO apparatus, an OCT apparatus, an OCTA apparatus, and a fundus. It includes a camera and an endoscope.

The OCT device may include a time domain OCT (TD-OCT) device and a Fourier domain OCT (FD-OCT) device. Further, the Fourier domain OCT device may include a spectral domain OCT (SD-OCT) device and a wavelength swept OCT (SS-OCT) device. The SLO device and the OCT device may include a wavefront compensation SLO (AO-SLO) device using a wavefront compensation optical system, a wavefront compensation OCT (AO-OCT) device, and the like. Further, the SLO device and the OCT device may include a polarization SLO (PS-SLO) device and a polarization OCT (PS-OCT) device for visualizing information on the polarization phase difference and depolarization.

Further, in the learned model for improving image quality according to the above-described various embodiments and modified examples, the magnitude of the brightness value of the tomographic image, the order and inclination of the bright part and the dark part, the position, the distribution, the continuity, etc. It can be considered that it is extracted as a part of and used for the estimation process. Similarly, even in the learned model for segmentation processing, image analysis, and diagnostic result generation, one of the feature values is the brightness value of the tomographic image, the order and inclination of the bright and dark parts, the position, the distribution, and the continuity. It can be considered that it is extracted as a part and used for the estimation process. On the other hand, in learned models for voice recognition, character recognition, gesture recognition, etc., learning is performed using time-series data, so the slope between input continuous time-series data values is characteristic. It is considered that it is extracted as a part of the quantity and used for the estimation process. Therefore, such a learned model is expected to be able to perform accurate estimation by using the influence of a concrete change of a numerical value in the estimation process.

According to the above-described embodiment and modification, it is possible to generate an image in which appropriate image processing is performed for each observation target area.

(Other Examples)
The present invention also provides a process for supplying a program that implements one or more functions of the above-described embodiments and modifications to a system or apparatus via a network or a storage medium, and a computer of the system or apparatus reads and executes the program. It is feasible. A computer has one or more processors or circuits and may include separate computers or networks of separate processors or circuits for reading and executing computer-executable instructions.

The processor or circuit may include a central processing unit (CPU), a micro processing unit (MPU), a graphics processing unit (GPU), an application specific integrated circuit (ASIC), or a field programmable gateway (FPGA). Also, the processor or circuit may include a digital signal processor (DSP), a data flow processor (DFP), or a neural processing unit (NPU).

The present invention is not limited to the above embodiments and modifications, and various changes and modifications can be made without departing from the spirit and scope of the present invention. Therefore, the following claims are appended to make the scope of the present invention public.

This application claims priority based on Japanese Patent Application No. 2018-191449 filed October 10, 2018 and Japanese Patent Application No. 2019-183106 filed October 3, 2019. , The entire contents of which are incorporated herein.

30: control unit (image processing device), 310: acquisition unit, 322: image quality improvement unit

Claims

An acquisition unit that acquires a first medical image of the subject;
An image quality improving unit that uses the learned model to generate a second medical image from the first medical image such that different regions in the first medical image have undergone different image processing;
An image processing apparatus comprising:
An acquisition unit that acquires a first medical image of the subject;
Using the learned model, a high-quality image of a different region of the first medical image and the second region different from the first region is formed from the first medical image. A high-quality image generating unit for generating a medical image of 2;
An image processing apparatus comprising:
The learning data of the learned model is a medical image obtained by photographing a subject, and includes a medical image that has been subjected to gradation conversion processing corresponding to any of different regions in the medical image. The image processing device according to 1 or 2.
The learning data of the learned model is a medical image obtained by photographing a subject, and includes a medical image acquired in a photographing mode corresponding to any of different regions in the medical image. The image processing device according to any one of 3 above.
The first medical image and the second medical image are tomographic images,
The image processing apparatus according to claim 1, wherein the first medical image is a tomographic image obtained by using optical interference.
Further comprising a selection unit for selecting image processing to be applied to the first medical image according to an instruction from an operator,
The image quality improving unit performs a gradation conversion process on the first medical image based on the image process selected by the selecting unit without using the learned model to generate a third medical image. Alternatively, the image processing apparatus according to claim 1, wherein the second medical image is generated from the first medical image using the learned model.
In the second medical image, the image quality improving unit determines a pixel value of a connecting portion of a plurality of different regions in the first medical image based on pixel values of pixels around the connecting portion, or The image processing device according to claim 1, wherein the pixel values of surrounding pixels are modified based on the pixel values of the connection portion.
The subject is an eye to be inspected,
The image processing apparatus according to claim 1, wherein the different region includes at least one of a retina, a vitreous body, a choroid, a lens, a cornea, an iris, and an anterior chamber.
9. The learning data of the learned model includes an image obtained by one of a superimposing process, a maximum posterior probability estimating process, a smoothing filter process, and a gradation converting process. The image processing device according to one item.
The learning data of the learned model is a medical image captured by an image capturing device having a higher performance than an image capturing device used for capturing the first medical image, or a man-hour required for the first medical image capturing step. The image processing apparatus according to claim 1, wherein the image processing apparatus includes medical images acquired in many imaging steps.
The learning data of the learned model is a medical image obtained by photographing a subject, and the medical image is obtained with respect to a medical image acquired in a photographing mode corresponding to any of different regions in the medical image. The image processing apparatus according to claim 1, further comprising a medical image that has been subjected to gradation conversion processing corresponding to any of different areas in.
The learning data of the learned model is one of different regions in the image with respect to the image obtained by one of the superimposing process, the maximum posterior probability estimating process, the smoothing filtering process, and the gradation converting process. The image processing apparatus according to claim 1, comprising a medical image that has been subjected to gradation conversion processing corresponding to.
The learning data of the learned model has a medical image captured by an image capturing device having a higher performance than an image capturing device used for capturing the first medical image, or a man-hour required for the first medical image capturing process. 13. The medical image obtained by a large number of photographing steps includes a medical image that has been subjected to gradation conversion processing corresponding to any of different regions in the medical image, and the medical image is included in any one of claims 1 to 12. Image processing device.
The image processing apparatus according to any one of claims 1 to 13, further comprising an analysis unit that applies different analysis conditions to each of a plurality of mutually different regions in the second medical image.
A display control unit for controlling the display of the display unit,
The image processing device according to claim 14, wherein the display control unit causes the display unit to display an analysis result of each of the plurality of different regions by the analysis unit.
A display control unit for controlling the display of the display unit,
15. The display control unit causes the display unit to display, together with the second medical image, that the second medical image is an image generated using the learned model. The image processing device according to item 1.
The image quality improving unit is
Using a learned model different from the learned model for generating the second medical image, generating label images with different label values for the different regions from the first medical image,
The image processing device according to claim 1, wherein the second medical image is generated from the label image using a learned model for generating the second medical image.
17. The image processing unit according to claim 1, further comprising an image processing unit that identifies a different region in at least one of the first medical image and the second medical image from at least one image. Image processing device.
The image processing according to claim 18, wherein the image processing unit specifies a different region in the at least one image by using a learned model different from a learned model for generating the second medical image. apparatus.
The first medical image and the second medical image are moving images,
With the second medical image displayed as a moving image, a drive control unit that drives and controls an optical member that changes an imaging range such that one of the specified different areas is located at a predetermined position in the display area. The image processing device according to claim 18, further comprising:
21. The image processing unit generates an image analysis result for each of the specified different regions by using a learned model different from a learned model for generating the second medical image. The image processing device according to claim 1.
The image processing unit generates a diagnosis result for each of the specified different regions by using a learned model different from a learned model for generating the second medical image. The image processing device according to item 1.
The image processing unit provides information regarding a difference between a medical image obtained by using a hostile generation network or an auto encoder for each of the specified different areas and a medical image input to the hostile generation network or the auto encoder. The image processing device according to claim 18, wherein the image processing device is generated as information regarding an abnormal part.
24. The image processing unit searches for similar case images for each of the specified different regions using a learned model different from a learned model for generating the second medical image. The image processing device according to claim 1.
The first medical image and the second medical image are three-dimensional OCT tomographic images,
The image processing device according to any one of claims 18 to 24, wherein the image processing unit generates a front image corresponding to a partial depth range of the second medical image.
Acquiring a first medical image of the subject,
Using the learned model to generate a second medical image from the first medical image such that different regions in the first medical image have undergone different image processing;
An image processing method including:
Acquiring a first medical image of the subject,
Using the learned model, a high-quality image of a different region of the first medical image and the second region different from the first region is formed from the first medical image. Generating a medical image of step 2,
An image processing method including:
A program which, when executed by a processor, causes the processor to execute each step of the image processing method according to claim 26.