JP2022155690A - Image processing device, image processing method, and program

Info

Publication number: JP2022155690A
Application number: JP2021059043A
Authority: JP
Inventors: 学山添; Manabu Yamazoe; 好彦岩瀬; Yoshihiko Iwase; 弘樹内田; Hiroki Uchida; 律也富田; Ritsuya Tomita
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-03-31
Filing date: 2021-03-31
Publication date: 2022-10-14

Abstract

PROBLEM TO BE SOLVED: To use a medical image of an image size suitable for learning and inference (estimation) as input data for a trained model. SOLUTION: An image processing device includes an image processing unit. The image processing unit inputs a first image as input data to a trained model obtained by training with learning data that includes input data that is a medical image having a first image size. The first image is a medical image having a second image size larger than the first image size, and the image processing unit thereby outputs a second image having the second image size as output data. SELECTED DRAWING: Figure 4

Description

The disclosed technology relates to an image processing device, an image processing method, and a program.

An apparatus using optical coherence tomography (OCT), referred to as an OCT apparatus, is known as an apparatus for obtaining tomographic images of a subject. A medical tomographic imaging apparatus such as an OCT apparatus makes it possible to observe the state inside the retinal layers three-dimensionally, and is useful for diagnosing ophthalmic retinal diseases such as AMD. The OCT methods used in clinical practice in recent years for high-speed image acquisition fall broadly into two types: SD-OCT (Spectral Domain OCT) and SS-OCT (Swept Source OCT). SD-OCT uses a broadband light source and acquires the interferogram with a spectrometer. SS-OCT, in contrast, uses a high-speed wavelength-swept light source and measures the spectral interference with a single-channel photodetector.

Recently, OCT angiography (OCTA), which images blood vessels without a contrast agent, has attracted attention for both types of OCT. OCTA generates motion contrast data from OCT images acquired by OCT. Motion contrast data is data obtained by repeatedly imaging the same cross section of the measurement target by OCT and detecting temporal changes in the target between acquisitions. It is calculated, for example, from the difference, ratio, or correlation of temporal changes in the phase, vector, or intensity of the complex OCT signal.

When displayed, an OCTA image is now commonly shown as a two-dimensional OCTA front image, obtained by projecting the three-dimensional motion contrast data calculated from the acquired three-dimensional OCT image onto a two-dimensional plane. In this regard, Patent Document 1 discloses a technique for generating a two-dimensional front image by designating the depth range of the motion contrast data to be projected. Patent Document 2 discloses a technique in which an artificial intelligence engine, obtained by machine learning with low-resolution and high-resolution images, generates a high-resolution image from a low-resolution image.

Patent Document 1: JP 2017-6179 A
Patent Document 2: JP 2018-5841 A

In general, more learning data is better, and inference processing (estimation processing) should run fast.

Accordingly, one object of the disclosed technology is to use medical images of an image size suitable for both learning and inference (estimation) as input data for a trained model.

The object is not limited to the above. Achieving functions and effects that are derived from the configurations shown in the embodiments described later, and that cannot be obtained with conventional technology, can also be positioned as another object of the present disclosure.

An image processing apparatus according to at least one embodiment of the disclosed technology includes an image processing unit that inputs a first image as input data to a trained model and thereby outputs a second image as output data. The trained model is obtained by training with learning data that includes input data that is a medical image having a first image size. The first image is a medical image having a second image size larger than the first image size, and the second image also has the second image size.
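One way a model trained on small images can accept larger ones is for it to consist only of size-independent operations such as convolutions. The sketch below illustrates that idea under this assumption and is not the network of the embodiment: `conv_same` and the 3x3 averaging `KERNEL` are hypothetical stand-ins for a trained model, and plain Python lists are used to keep the example dependency-free.

```python
def conv_same(image, kernel):
    """Apply a 2D filter with zero padding so the output size equals the input size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - ph, x + j - pw
                    if 0 <= yy < h and 0 <= xx < w:  # zero padding outside the image
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


# Hypothetical "trained" weights. In practice they would be learned from
# training pairs whose input images have the first (smaller) image size.
KERNEL = [[1 / 9] * 3 for _ in range(3)]

small = [[float(x + y) for x in range(4)] for y in range(4)]  # first image size: 4x4
large = [[float(x + y) for x in range(8)] for y in range(8)]  # second image size: 8x8

small_out = conv_same(small, KERNEL)
large_out = conv_same(large, KERNEL)
print(len(small_out), len(small_out[0]))
print(len(large_out), len(large_out[0]))
```

Because the same weights slide over whatever input is supplied, the output always has the size of the input, which matches the relationship between the first and second images stated above.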

According to at least one embodiment of the disclosed technology, medical images of an image size suitable for learning and inference (estimation) can be used as input data for a trained model.

The drawings show the following:
Block diagram showing a schematic configuration of the OCT apparatus according to the first embodiment.
Diagram illustrating a schematic configuration of the imaging apparatus according to the first embodiment.
Block diagram showing a schematic configuration of the image quality enhancement unit according to the first embodiment.
An example of the configuration of a neural network for the image quality enhancement engine.
An example of learning data for image quality enhancement processing.
An example of an input image for image quality enhancement processing.
Flow diagram showing a schematic flow of image processing according to the first embodiment.
Flow diagram showing an example of the flow of image processing according to the first embodiment (this description applies to six drawings).
Flow diagram showing a schematic flow of image processing according to the second embodiment.
An example of a machine learning model according to Modification 8 (this description applies to two drawings).
An example of a plurality of subset regions obtained by dividing an input image.
An example of Self-Attention used in a Transformer.

Exemplary embodiments for carrying out the present invention are described in detail below with reference to the drawings. The dimensions, materials, shapes, relative positions of components, and the like described in the following embodiments are arbitrary and can be changed according to the configuration of the apparatus to which the present invention is applied or to various conditions. The same reference numerals are used across drawings to indicate identical or functionally similar elements. In the following, the axial direction of the eye is denoted Z, the horizontal direction of the fundus plane X, and the vertical direction of the fundus plane Y.

In the following, a machine learning model refers to a learning model based on a machine learning algorithm. Specific machine learning algorithms include the nearest neighbor method, the naive Bayes method, decision trees, and support vector machines. Deep learning, in which a neural network itself generates the feature quantities and connection weighting coefficients used for learning, is another example. Any of these algorithms that is usable can be applied, as appropriate, to the following embodiments and modifications. Teacher data refers to learning data and consists of pairs of input data and output data. Correct data refers to the output data of the learning data (teacher data).

A trained model is a machine learning model, following any machine learning algorithm such as deep learning, that has been trained in advance with appropriate teacher data (learning data). Although the trained model is obtained in advance with appropriate learning data, this does not rule out further learning: additional learning can also be performed, including after the apparatus has been installed at its place of use.

(First embodiment)
An image processing system according to a first embodiment of the present invention is described in detail below with reference to FIGS. 1 to 13. In this embodiment, as an example of an image processing system, an OCT apparatus is described that includes an image processing apparatus for processing tomographic images of a subject acquired by OCT.

(Configuration of the image processing apparatus)
The configuration of the image processing apparatus 101 according to this embodiment and its connections to other devices are described with reference to FIG. 1. The image processing apparatus 101 is connected to the imaging apparatus 100, the external storage device 102, the output unit 103, and the input unit 104. These connections may be wired or wireless, and may also be made via a network. For example, the image processing apparatus 101 may be connected to the imaging apparatus 100 via a network such as the Internet. The external storage device 102 may likewise be placed on a network such as the Internet so that its data can be shared by a plurality of image processing apparatuses.

The image processing apparatus 101 is provided with the following functional blocks: an acquisition unit 101-1, an imaging control unit 101-2, an image processing unit 101-4, and a display control unit 101-5, as well as a storage unit 101-3. The image processing apparatus 101 can be configured using a general-purpose computer including a processor, memory, and the like, but may also be configured as a computer dedicated to the OCT apparatus. The image processing apparatus 101 may be a computer built into the OCT apparatus (internal) or a separate (external) computer to which the OCT apparatus is communicably connected. The image processing apparatus 101 may be, for example, a personal computer, such as a desktop PC, a notebook PC, or a tablet PC (portable information terminal). The processor may be a CPU (Central Processing Unit). The processor may also be, for example, an MPU (Micro Processing Unit), a GPU (Graphics Processing Unit), or an FPGA (Field-Programmable Gate Array).

Each function of the image processing apparatus 101 may be implemented by a processor such as a CPU or MPU executing software modules stored in the storage unit 101-3. The processor may also be, for example, a GPU or an FPGA. Each function may instead be implemented by a circuit that performs a specific function, such as an ASIC. For example, the image processing unit 101-4 may be implemented with dedicated hardware such as an ASIC, and the display control unit 101-5 may be implemented with a dedicated processor, such as a GPU, that is separate from the CPU. The storage unit 101-3 may be configured with any storage medium, such as a hard disk, an optical disk, or a memory.

The acquisition unit 101-1 is a functional block that acquires signal data for SLO images, tomographic images, anterior segment images, and the like obtained by imaging the subject with the imaging apparatus 100; images such as SLO images, tomographic images, and anterior eye observation images; patient information; and similar data. The acquisition unit 101-1 can also use the acquired signal data to generate images such as SLO images, tomographic images, and anterior eye observation images. The acquisition unit 101-1 is provided with a tomographic image generation unit 101-11 and a motion contrast data generation unit 101-12. The acquisition unit 101-1 may acquire various data, including images, from an external device (not shown), such as a server, connected to the image processing apparatus 101. The image processing apparatus 101 and the external device (not shown) may be connected via any network such as the Internet. The acquisition unit 101-1 may also acquire various data stored in the storage unit 101-3.

The tomographic image generation unit 101-11 applies signal processing to the acquired tomographic signal data (interference signal) to generate a tomographic image, and stores the generated tomographic image in the storage unit 101-3. Any known method may be used to generate a tomographic image from an interference signal.

The motion contrast data generation unit 101-12 generates motion contrast data based on a plurality of tomographic images of substantially the same position (mutually corresponding regions of the subject) generated by the tomographic image generation unit 101-11. The method of generating motion contrast data is described below.

First, the tomographic image generation unit 101-11 generates a plurality of tomographic images from a plurality of interference signals acquired by imaging substantially the same position of the subject a plurality of times. The group of measurement-light scans used to image substantially the same position of the subject a plurality of times in order to generate motion contrast data is called one cluster. The plurality of tomographic images corresponding to the plurality of interference signals acquired by imaging the same position of the subject a plurality of times is called the tomographic images for one cluster (one group). More specifically, the tomographic image generation unit 101-11 applies wavenumber conversion, a fast Fourier transform (FFT), and absolute value conversion (amplitude extraction) to the plurality of interference signals acquired by the acquisition unit 101-1, thereby generating the tomographic images for one cluster. The method of generating tomographic images is not limited to this, and any known method that performs other image processing may be used.
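The three steps named above (wavenumber conversion, Fourier transform, absolute value) can be sketched for a single depth profile as follows. This is a minimal illustration, not the apparatus's implementation: it assumes that wavenumber conversion means linear resampling of a wavelength-sampled spectrum onto an evenly spaced wavenumber grid, it uses a naive discrete Fourier transform in place of an FFT to stay dependency-free, and all function names and the synthetic fringe are hypothetical.

```python
import cmath
import math


def resample_to_uniform_wavenumber(spectrum, wavelengths):
    """Linearly interpolate samples taken at the given wavelengths onto an
    evenly spaced wavenumber grid (k = 2*pi / wavelength)."""
    pairs = sorted((2 * math.pi / lam, s) for lam, s in zip(wavelengths, spectrum))
    ks = [k for k, _ in pairs]
    vals = [s for _, s in pairs]
    n = len(ks)
    step = (ks[-1] - ks[0]) / (n - 1)
    resampled = []
    j = 0
    for i in range(n):
        target = ks[0] + i * step
        # Advance to the source interval [ks[j], ks[j + 1]] containing target.
        while j < n - 2 and ks[j + 1] < target:
            j += 1
        t = (target - ks[j]) / (ks[j + 1] - ks[j])
        t = min(max(t, 0.0), 1.0)  # guard against rounding at the ends
        resampled.append(vals[j] + t * (vals[j + 1] - vals[j]))
    return resampled


def dft_magnitude(samples):
    """Amplitude of the discrete Fourier transform (naive O(N^2) form)."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * q * m / n) for m, s in enumerate(samples)))
        for q in range(n)
    ]


def a_scan_profile(spectrum, wavelengths):
    """Wavenumber resampling, Fourier transform, then absolute value."""
    linear_k = resample_to_uniform_wavenumber(spectrum, wavelengths)
    # A real input gives a mirrored spectrum, so keep the first half only.
    return dft_magnitude(linear_k)[: len(linear_k) // 2]


# Synthetic fringe for a single reflector, sampled evenly in wavelength.
N = 64
DEPTH_BIN = 5
wavelengths = [800e-9 + i * (80e-9 / (N - 1)) for i in range(N)]
k_min, k_max = 2 * math.pi / wavelengths[-1], 2 * math.pi / wavelengths[0]
k_step = (k_max - k_min) / (N - 1)
spectrum = [
    math.cos(2 * math.pi * DEPTH_BIN * ((2 * math.pi / lam) - k_min) / (k_step * N))
    for lam in wavelengths
]

profile = a_scan_profile(spectrum, wavelengths)
peak_bin = profile.index(max(profile))
print(peak_bin)
```

Repeating this for every lateral position yields one tomographic image, and repeating the scan at the same position yields the images of one cluster.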

Next, the alignment unit 101-41 of the image processing unit 101-4, described later, aligns the tomographic images belonging to the same cluster with one another and performs superimposition processing. The image feature acquisition unit 101-44 of the image processing unit 101-4, described later, then acquires layer boundary data from the superimposed tomographic image. In this embodiment a deformable model is used to acquire the layer boundaries, but any known layer boundary acquisition method may be used. Layer boundary acquisition is not essential. For example, it can be omitted when motion contrast images are generated only in three dimensions and no two-dimensional motion contrast image projected in the depth direction is generated.

The motion contrast data generation unit 101-12 calculates motion contrast data between adjacent tomographic images within the same cluster. In this embodiment, the motion contrast data generation unit 101-12 obtains a decorrelation value Mxz as the motion contrast data based on Equation (1).

Here, Axz is the amplitude (of the complex data after FFT processing) at position (x, z) of tomographic image data A, and Bxz is the amplitude at the same position (x, z) of tomographic image data B. Mxz satisfies 0 ≤ Mxz ≤ 1 and takes a value closer to 1 as the difference between the two amplitudes increases. The motion contrast data generation unit 101-12 performs the decorrelation calculation of Equation (1) between any temporally adjacent tomographic images (belonging to the same cluster). The tomographic images subjected to the decorrelation calculation need not be temporally adjacent as long as they are images within a predetermined period. The motion contrast data generation unit 101-12 generates, as the final motion contrast image, an image whose pixel values are the average of the (number of tomographic images per cluster - 1) motion contrast values obtained.
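Equation (1) itself does not appear in this text. The sketch below therefore assumes a commonly used amplitude decorrelation form, Mxz = 1 - 2·Axz·Bxz / (Axz² + Bxz²), which is consistent with the properties stated above (it lies in [0, 1] for non-negative amplitudes and grows as the two amplitudes differ); it should be read as an assumption rather than as the patent's exact formula. The function names and the small `eps` term are likewise hypothetical.

```python
def decorrelation(a, b, eps=1e-12):
    """Per-pixel decorrelation between two amplitude images (nested lists).

    eps avoids division by zero; pixels where both amplitudes are 0
    therefore evaluate to 1.
    """
    return [
        [1.0 - 2.0 * av * bv / (av * av + bv * bv + eps) for av, bv in zip(ra, rb)]
        for ra, rb in zip(a, b)
    ]


def motion_contrast(cluster):
    """Average the decorrelation over temporally adjacent image pairs.

    A cluster of n images yields n - 1 decorrelation maps.
    """
    maps = [decorrelation(a, b) for a, b in zip(cluster[:-1], cluster[1:])]
    height, width = len(cluster[0]), len(cluster[0][0])
    return [
        [sum(m[z][x] for m in maps) / len(maps) for x in range(width)]
        for z in range(height)
    ]


# One cluster of three 1x2 amplitude images: the left pixel is static tissue,
# the right pixel fluctuates as it would inside a vessel.
cluster = [
    [[2.0, 1.0]],
    [[2.0, 3.0]],
    [[2.0, 1.0]],
]
mc = motion_contrast(cluster)
print(mc)
```

The static pixel stays near 0 while the fluctuating pixel takes a clearly positive value, which is the contrast an OCTA image displays.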

Although the motion contrast data is calculated here from the amplitude of the complex data after FFT processing, the calculation method is not limited to this. For example, the motion contrast data may be calculated from the phase information of the complex data, or from both the amplitude and the phase information. Alternatively, it may be calculated from the real part or the imaginary part of the complex data.

In this embodiment a decorrelation value is calculated as the motion contrast data, but the calculation method is not limited to this, and any known calculation method may be used. For example, the motion contrast data may be calculated from the difference between two values or from the ratio of two values. The motion contrast data can also be obtained, for example, as the variance between two tomographic images or their corresponding interference signals, or as the maximum value divided by the minimum value (maximum/minimum). Any known method may be used for these calculations.

When the scanning means is controlled so that the measurement light scans substantially the same position a plurality of times, the apparatus may be configured so that the time interval between one scan (one B-scan) and the next scan (the next B-scan) is changed (determined). This allows the blood vessel region to be visualized accurately even when, for example, the blood flow velocity differs depending on the state of the blood vessel.

In this case, the time interval may be changeable in response to an instruction from the operator (examiner). The apparatus may also be configured so that, in response to an instruction from the operator, one motion contrast image can be selected from a plurality of motion contrast images corresponding to a plurality of preset time intervals. The time interval at which motion contrast data was acquired may be stored in the storage unit 101-3 in association with that motion contrast data. The display control unit 101-5 may cause the output unit 103 to display the time interval at which the motion contrast data was acquired together with the motion contrast image corresponding to that data. The apparatus may further be configured so that the time interval, or at least one candidate for the time interval, is determined automatically. In that case, a machine learning model may be used to determine (output) the time interval from a motion contrast image. Such a machine learning model can be obtained, for example, by training with learning data in which a plurality of motion contrast images corresponding to a plurality of time intervals are the input data, and the differences from those time intervals to the time interval at which the desired motion contrast image was acquired are the correct data.

Here, the pixel values of the final motion contrast image are obtained by averaging the plurality of decorrelation values acquired, but the final pixel values are not limited to this. For example, an image whose pixel values are the median or the maximum of the acquired decorrelation values may be generated as the final motion contrast image.
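The choice between mean, median, and maximum can be sketched as a per-pixel reduction over the stack of decorrelation maps. The names `REDUCERS` and `combine_maps` are hypothetical, and the 1x1 maps are made-up values chosen only to show how the three choices differ.

```python
import statistics

# Per-pixel reducers for combining several decorrelation maps into one image.
REDUCERS = {
    "mean": statistics.fmean,
    "median": statistics.median,
    "max": max,
}


def combine_maps(maps, method="mean"):
    """Combine equally sized 2D maps pixel by pixel with the chosen reducer."""
    reduce_fn = REDUCERS[method]
    height, width = len(maps[0]), len(maps[0][0])
    return [
        [reduce_fn([m[z][x] for m in maps]) for x in range(width)]
        for z in range(height)
    ]


# Three 1x1 decorrelation maps; the last one contains an outlier.
maps = [[[0.1]], [[0.2]], [[0.9]]]
mean_img = combine_maps(maps, "mean")
median_img = combine_maps(maps, "median")
max_img = combine_maps(maps, "max")
print(mean_img, median_img, max_img)
```

The median is less affected by a single outlying pair than the mean, while the maximum emphasizes any pair in which motion was detected.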

The imaging control unit 101-2 is a functional block that controls imaging by the imaging apparatus 100. For example, the imaging control unit 101-2 can drive and control the light source, the scanning unit, the drive device for the focusing lens, and other components of the imaging apparatus 100. The imaging control unit 101-2 can control alignment operations of the stage unit 100-2 and other parts of the imaging apparatus 100, described later. The imaging control unit 101-2 can also instruct the imaging apparatus 100 on the setting of imaging parameters and on the start or end of imaging.

The storage unit 101-3 can store an operating system (OS), device drivers for peripheral devices, and programs that implement various application software, including programs for performing the processing described later. The storage unit 101-3 can also store information acquired by the acquisition unit 101-1 and various images processed by the image processing unit 101-4. For example, the storage unit 101-3 can store medical images such as tomographic images acquired by the acquisition unit 101-1, and images whose quality has been enhanced by the image quality enhancement unit 101-47, described later.

The image processing unit 101-4 is a functional block that performs image processing on various images such as tomographic images, motion contrast images, SLO images, and anterior eye observation images. The image processing unit 101-4 is provided with an alignment unit 101-41, a synthesis unit 101-42, a correction unit 101-43, an image feature acquisition unit 101-44, a projection unit 101-45, an analysis unit 101-46, and an image quality enhancement unit 101-47.

The alignment unit 101-41 is a functional block that performs alignment between images. For example, the alignment unit 101-41 can align tomographic images belonging to the same cluster with one another and perform superimposition processing. The alignment and superimposition processing may be performed using any known method.

The synthesis unit 101-42 is a functional block that synthesizes one image from a plurality of two-dimensional images. The synthesis unit 101-42 is provided with, for example, a synthesis method designation unit 101-421, a same-modality image synthesis unit 101-422, and a different-modality image synthesis unit 101-423.

The synthesis method designation unit 101-421 designates the type of image to be synthesized (for example, tomographic images, motion contrast images, or tomographic and motion contrast images) and the synthesis method (for example, superimposition, stitching, or side-by-side display). The same-modality image synthesis unit 101-422 performs synthesis between tomographic images or between motion contrast images, for example. The different-modality image synthesis unit 101-423 performs synthesis between images obtained with different modalities, such as between a tomographic image and a motion contrast image.

The correction unit 101-43 is a functional block that suppresses projection artifacts occurring in motion contrast images. A projection artifact is a phenomenon in which the motion contrast of superficial retinal blood vessels appears on the deeper side (the deep retina, the outer retina, and the choroid), producing high decorrelation values in deep regions where no blood vessels actually exist. For example, the correction unit 101-43 performs processing to reduce projection artifacts in the generated motion contrast data. The correction unit 101-43 thus corresponds to an example of processing means that reduces projection artifacts in the generated motion contrast data.

The image feature acquisition unit 101-44 acquires layer boundary data from tomographic images, motion contrast tomographic images, and the like. Specifically, the image feature acquisition unit 101-44 performs image segmentation on a tomographic image or the like, extracts the layer structure in the cross section of the eye to be examined, and acquires layer boundary data such as boundary positions. Any known method may be used for the image segmentation.

The projection unit 101-45 is a functional block that projects or integrates a tomographic image or a motion contrast image over a set depth range to generate a luminance front image (luminance En-Face image) or an OCTA front image. The depth range can be set using two reference surfaces based on an operator's instruction or on the layer boundary data acquired by the image feature acquisition unit 101-44. Any depth range may be set. For example, depth ranges for the superficial retinal layers and the outer retinal layers can be set, and the combining unit 101-42 can generate two types of combined OCTA front images.

Here, one way to project the data corresponding to a depth range set from two reference surfaces onto a two-dimensional plane is to use a representative value of the data within that depth range as the pixel value on the two-dimensional plane. The representative value can be, for example, the mean, median, or maximum of the pixel values within the depth-direction extent of the region bounded by the two reference surfaces. Specifically, maximum intensity projection (MIP), average intensity projection (AIP), or the like can be selected as the projection method.
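The projection between two reference surfaces can be sketched as follows. This is a minimal illustration, not the implementation of the projection unit 101-45: the function name `project_en_face`, the nested-list volume layout, and the convention that each surface is given as a per-position depth index are all assumptions made for the example.

```python
from statistics import mean, median


def project_en_face(volume, upper, lower, method="mip"):
    """Project a depth range of a volume onto a 2-D front image.

    volume[y][x] is the list of values along depth (z) at that position.
    upper[y][x] and lower[y][x] are the z indices of the two reference
    surfaces; the range used is upper (inclusive) to lower (exclusive).
    """
    reducers = {"mip": max, "aip": mean, "median": median}
    reduce_fn = reducers[method]
    image = []
    for y, row in enumerate(volume):
        out_row = []
        for x, a_scan in enumerate(row):
            segment = a_scan[upper[y][x]:lower[y][x]]
            # An empty depth range yields 0 so the output stays rectangular.
            out_row.append(reduce_fn(segment) if segment else 0)
        image.append(out_row)
    return image


# Toy volume: 1 row, 2 columns, 4 depth samples per position.
volume = [[[1, 5, 3, 2], [0, 2, 8, 4]]]
upper = [[1, 0]]
lower = [[3, 3]]
mip = project_en_face(volume, upper, lower, "mip")
aip = project_en_face(volume, upper, lower, "aip")
```

Because the surfaces are given per position, the same function covers both flat depth ranges and ranges that follow a segmented layer boundary. A practical implementation would normally use an array library rather than nested lists.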

The depth range for a front image may also be, for example, a range that extends a predetermined number of pixels in the deeper or shallower direction from one of two layer boundaries of the detected retinal layers. The depth range for an En-Face image may also be, for example, a range obtained by changing (offsetting) the range between two layer boundaries of the detected retinal layers in accordance with an operator's instruction. The depth range for generating a front image can be changed by the operator selecting from a set of predefined depth ranges displayed in a selection list or the like (not shown). The operator can also change the projection range by changing, from a user interface (UI), the type of layer boundary used to specify the projection range and its offset position, or by operating the input unit 104 to move the layer boundary data superimposed on a tomographic image.
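As a rough sketch of a depth range defined relative to a single layer boundary, the following computes per-position start and end depth indices from a boundary surface and two pixel offsets. The function name `depth_ranges_from_boundary`, the sign convention (positive offsets point deeper), and the clamping to the volume extent are assumptions for illustration only.

```python
def depth_ranges_from_boundary(boundary, start_offset, end_offset, depth_size):
    """Return per-position (start, end) depth indices around one boundary.

    boundary[y][x] is the z index of the layer boundary at that position.
    Offsets are in pixels; negative values are shallower, positive deeper.
    Results are clamped to the valid range 0..depth_size.
    """
    def clamp(z):
        return max(0, min(depth_size, z))

    return [
        [(clamp(z + start_offset), clamp(z + end_offset)) for z in row]
        for row in boundary
    ]


# Boundary at z=5 and z=2; take 3 pixels above to 4 pixels below it.
boundary = [[5, 2]]
ranges = depth_ranges_from_boundary(boundary, start_offset=-3, end_offset=4, depth_size=8)
```

Changing the offsets in response to an operator's instruction then amounts to calling the function again with new values.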

The generated luminance front image and OCTA front image can be output by the output unit 103. The motion contrast image output to the output unit 103 is not limited to an OCTA front image, and may be a three-dimensionally rendered three-dimensional motion contrast image or a motion contrast tomographic image corresponding to a tomographic image. When a generated motion contrast image or the like is displayed by the output unit 103, the projection method described above and whether projection artifact suppression is applied may be changed by selection from a UI such as a context menu. For example, a motion contrast image after projection artifact suppression may be displayed as a three-dimensional image.

The analysis unit 101-46 is a functional block that performs analysis processing on various images such as tomographic images and motion contrast images. The analysis unit 101-46 includes an enhancement unit 101-461, an extraction unit 101-462, a measurement unit 101-463, and a comparison unit 101-464.

The enhancement unit 101-461 can, for example, perform enhancement processing that emphasizes the data in an arbitrary region of an image in accordance with an operator's instruction. The extraction unit 101-462 can extract features, regions, and the like in the images used by the analysis unit 101-46. For example, the extraction unit 101-462 can acquire, from a tomographic image, the layer boundaries of the retina and choroid, the boundaries of the anterior and posterior surfaces of the lamina cribrosa, the positions of the fovea and the center of the optic disc, and the like. The extraction unit 101-462 can also extract a blood vessel region from an OCTA front image. The measurement unit 101-463 calculates, for example, measurement values to be analyzed from various images. For example, the measurement unit 101-463 can calculate the thickness of a specific layer, or calculate measurement values such as blood vessel density using the extracted blood vessel region or blood vessel centerline data obtained by thinning the blood vessel region. The comparison unit 101-464 can compare multiple images, such as multiple tomographic images or multiple motion contrast images. The comparison unit 101-464 can also compare analysis results for multiple images, such as the measurement values calculated by the measurement unit 101-463.
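The two kinds of measurement mentioned above can be illustrated with a small sketch. The definitions used here are common ones and are assumptions, not taken from this description: vessel density is computed as the fraction of pixels marked as vessel in a binary mask, and layer thickness as the depth difference between two boundaries multiplied by a hypothetical axial pixel spacing `z_spacing_um`.

```python
def vessel_density(vessel_mask):
    """Fraction of pixels in a binary mask that are marked as vessel."""
    pixels = [bool(v) for row in vessel_mask for v in row]
    return sum(pixels) / len(pixels)


def layer_thickness_um(upper, lower, z_spacing_um):
    """Per-position thickness between two boundaries, in micrometres.

    upper[y][x] and lower[y][x] are boundary depth indices in pixels;
    z_spacing_um is the assumed axial size of one pixel.
    """
    return [
        [(lo - up) * z_spacing_um for up, lo in zip(up_row, lo_row)]
        for up_row, lo_row in zip(upper, lower)
    ]


mask = [[1, 0], [1, 1]]
density = vessel_density(mask)
thickness = layer_thickness_um([[2, 3]], [[5, 7]], z_spacing_um=2.0)
```

Passing a thinned (centerline) mask instead of the full vessel mask to `vessel_density` gives a length-based variant of the same measurement.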

The image quality enhancement unit 101-47 is a functional block that enhances the image quality of various images. The image quality enhancement unit 101-47 is thus an example of an image quality enhancement unit that enhances the image quality of medical images such as tomographic images and OCTA front images. The configuration and functions of the image quality enhancement unit 101-47 are described in detail later.

The display control unit 101-5 can control the display of the output unit 103 connected to the image processing apparatus 101. For example, the display control unit 101-5 can cause the output unit 103 to display various data such as tomographic images of the subject's eye and patient information.

The external storage device 102 can store programs for tomographic imaging, patient information, image information, and the like. For example, the external storage device 102 can store patient information (such as the patient's name, age, and sex) and information on the subject's eye (such as left eye or right eye and axial length) in association with captured images (such as tomographic images, SLO images, and OCTA images), combined images, imaging parameters, image data and measurement data from past examinations, parameters set by the operator, and the like.

The input unit 104 includes, for example, a mouse, a keyboard, and a touch panel, and the operator can give instructions to the image processing apparatus 101 and the imaging apparatus 100 via the input unit 104. The output unit 103 can output data such as the various images processed by the image processing apparatus 101. The output unit 103 can be any display or the like and, under the control of the display control unit 101-5, can display, for example, patient information on the subject and various images. In this case, the output unit 103 can function as an example of a display unit that displays various images and information. The output unit 103 may include a touch UI or the like.

Some components of the OCT apparatus may be configured as separate devices or as a single integrated device. For example, the output unit 103 may be a touch panel display integrated with the input unit 104.

(Configuration of the imaging apparatus)
The imaging apparatus 100 is an apparatus that images the subject's eye in order to obtain tomographic images, SLO images, and the like of the subject's eye. The configuration of the measurement optical system and the spectrometer in the imaging apparatus 100 according to this embodiment is described below with reference to FIGS. 1 and 2. In this embodiment, an imaging apparatus including an SD-OCT (Spectral Domain OCT) optical system is used as the imaging apparatus 100. The configuration is not limited to this; an imaging apparatus including an optical system such as SS-OCT or TD-OCT (Time Domain OCT) may be used instead.

As shown in FIG. 1, the imaging apparatus 100 includes a measurement optical system 100-1, a stage unit 100-2, and a base unit 100-3. The measurement optical system 100-1 is an optical system for acquiring an anterior segment image, an SLO fundus image of the subject's eye, and a tomographic image. The stage unit 100-2 holds the measurement optical system 100-1 so that it can move forward, backward, left, and right, and can move the measurement optical system 100-1 under the control of the imaging control unit 101-2. The base unit 100-3 houses the spectrometer 230 described later.

The configuration of the measurement optical system 100-1 is described with reference to FIG. 2. In the measurement optical system 100-1, an objective lens 201 is placed facing the subject's eye 200, and a first dichroic mirror 202 and a second dichroic mirror 203 are arranged on its optical axis. These dichroic mirrors split the optical path from the objective lens 201, by wavelength band, into an optical path 250 for the OCT optical system, an optical path 251 for the SLO optical system and fixation lamp, and an optical path 252 for anterior eye observation.

In this embodiment, the optical path 252 for anterior eye observation is arranged in the reflection direction of the first dichroic mirror 202, and the optical path 250 for the OCT optical system and the optical path 251 for the SLO optical system and fixation lamp are arranged in the transmission direction of the first dichroic mirror 202. The optical path 250 for the OCT optical system is arranged in the reflection direction of the second dichroic mirror 203, and the optical path 251 for the SLO optical system and fixation lamp is arranged in the transmission direction of the second dichroic mirror 203. However, the arrangement of the optical paths relative to each dichroic mirror may be reversed.

The optical path 251 for the SLO optical system and fixation lamp is provided with an SLO scanning unit 204, lenses 205 and 206, a mirror 207, a third dichroic mirror 208, an APD (Avalanche Photodiode) 209, an SLO light source 210, and a fixation lamp 211. The mirror 207 is a perforated mirror or a prism on which a hollow mirror is vapor-deposited, and separates the illumination light from the SLO light source 210 from the return light of that illumination light from the subject's eye 200. The third dichroic mirror 208 splits the optical path, by wavelength band, into an optical path to the SLO light source 210 and an optical path to the fixation lamp 211. In this embodiment, the optical path of the SLO light source 210 is arranged in the reflection direction of the third dichroic mirror 208, and the fixation lamp 211 is arranged in its transmission direction. However, the arrangement of the optical paths relative to the dichroic mirror may be reversed.

The SLO scanning unit 204 scans the illumination light emitted from the SLO light source 210 over the subject's eye 200, and consists of an X scanner that scans in the X direction and a Y scanner that scans in the Y direction. The SLO scanning unit 204 is controlled by the imaging control unit 101-2. In this embodiment, the X scanner is a polygon mirror for high-speed scanning and the Y scanner is a galvanometer mirror. However, the configuration of the SLO scanning unit 204 is not limited to this, and the X scanner and Y scanner may use any deflection mirrors suited to the desired configuration.

The lens 205 is driven along the optical axis by a motor (not shown) to focus the SLO optical system and the fixation lamp 211. The motor that drives the lens 205 is controlled by the imaging control unit 101-2.

The SLO light source 210 emits light with a wavelength of around 780 nm. The APD 209 detects the return light from the subject's eye 200. The fixation lamp 211 emits visible light to prompt the subject to fixate.

The illumination light emitted from the SLO light source 210 is reflected by the third dichroic mirror 208, passes through the mirror 207 and the lenses 206 and 205, reaches the SLO scanning unit 204, and is scanned over the subject's eye 200 by the SLO scanning unit 204. The return light from the subject's eye 200 travels back along the same path as the illumination light, is then reflected by the mirror 207, and is guided to the APD 209.

The APD 209 generates signal data for an SLO fundus image based on the return light and outputs it to the image processing apparatus 101. The acquisition unit 101-1 of the image processing apparatus 101 generates an SLO image based on the signal data output from the APD 209. An SLO image of the fundus of the subject's eye 200 can be acquired by scanning the illumination light emitted from the SLO light source 210 over the fundus of the subject's eye 200. Alternatively, an SLO image of the anterior segment of the subject's eye 200 can be acquired by scanning the illumination light emitted from the SLO light source 210 over the anterior segment of the subject's eye.

The light emitted from the fixation lamp 211 is transmitted through the third dichroic mirror 208 and the mirror 207, passes through the lenses 206 and 205, and is formed by the SLO scanning unit 204 into a predetermined shape at an arbitrary position on the subject's eye 200. Having the subject gaze at the light emitted from the fixation lamp 211 prompts the subject to fixate.

The optical path 252 for anterior eye observation is provided with lenses 212 and 213, a split prism 214, and a CCD 215 for anterior segment observation that detects infrared light. The CCD 215 is sensitive at the wavelength of the anterior segment observation illumination light (not shown), specifically around 970 nm. The signal data output from the CCD 215 is sent to the image processing apparatus 101, which can generate an anterior segment observation image based on the input signal data.

The split prism 214 is arranged at a position conjugate with the pupil of the subject's eye 200. Using a split image of the anterior segment based on the light that has passed through the split prism 214, the image processing apparatus 101 can detect the distance of the measurement optical system 100-1 from the subject's eye 200 in the Z-axis direction (optical axis direction).

The OCT optical system is provided in the optical path 250 for the OCT optical system and is used to capture tomographic images of the subject's eye 200. More specifically, the OCT optical system is used to obtain the interference signal from which a tomographic image is generated. The optical path 250 for the OCT optical system is provided with an XY scanner 216, lenses 217 and 218, and an optical fiber 224.

The XY scanner (OCT scanning unit) 216 scans the measurement light over the subject's eye 200. The XY scanner 216 is controlled by the imaging control unit 101-2. Although the XY scanner 216 is drawn as a single mirror in FIG. 2, it is actually a pair of galvanometer mirrors that scan along the two axes X and Y. However, the configuration of the XY scanner 216 is not limited to this, and any deflection mirrors suited to the desired configuration may be used. For example, the XY scanner 216 may use any deflection means, such as a MEMS mirror that can scan the measurement light in two dimensions with a single mirror.

The lens 217 is driven along the optical axis by a motor (not shown) to focus the measurement light emitted from the optical fiber 224, which is connected to the optical coupler 219, onto the subject's eye 200. With this focusing, the return light of the measurement light from the subject's eye 200 is simultaneously imaged as a spot on the tip of the optical fiber 224 and enters it. The motor that drives the lens 217 is controlled by the imaging control unit 101-2.

Next, the optical path from the OCT light source 220, the reference optical system, and the spectrometer are described. The OCT optical system further includes an OCT light source 220, a reference mirror 221, a dispersion compensation glass 222, a lens 223, an optical coupler 219, single-mode optical fibers 224 to 227 connected to and integrated with the optical coupler, and a spectrometer 230. Together, these components form a Michelson interferometer.

The OCT light source 220 is an SLD (Super Luminescent Diode), a typical low-coherence light source. Its center wavelength is 855 nm and its wavelength bandwidth is about 100 nm. The bandwidth is an important parameter because it affects the resolution of the resulting tomographic image in the optical axis direction. Although an SLD is chosen as the light source here, any source that emits low-coherence light may be used, such as an ASE (Amplified Spontaneous Emission) source. Near-infrared light is suitable for the center wavelength given that the eye is being measured. Because the center wavelength also affects the lateral resolution of the resulting tomographic image, a wavelength as short as possible can be used. For both reasons, the center wavelength of the OCT light source 220 is set to 855 nm in this embodiment.
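To illustrate how the bandwidth relates to the resolution in the optical axis direction, the sketch below evaluates the commonly used expression for a source with a Gaussian spectrum, δz = (2 ln 2 / π) · λ0² / (n · Δλ). This formula and the Gaussian-spectrum assumption are not stated in this description; the function name `axial_resolution_nm` and the refractive index parameter are likewise illustrative.

```python
import math


def axial_resolution_nm(center_nm, bandwidth_nm, refractive_index=1.0):
    """Axial resolution for a Gaussian spectrum (assumed), in nanometres.

    center_nm is the center wavelength, bandwidth_nm the full width at
    half maximum of the spectrum, refractive_index that of the medium.
    """
    coefficient = 2.0 * math.log(2.0) / math.pi
    return coefficient * center_nm ** 2 / (refractive_index * bandwidth_nm)


# Values from the embodiment: 855 nm center wavelength, about 100 nm bandwidth.
resolution_in_air_nm = axial_resolution_nm(855.0, 100.0)
```

With these values the expression gives a little over 3 µm in air, and halving the bandwidth doubles the figure, which is why the bandwidth is treated as an important parameter.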

The light emitted from the OCT light source 220 passes through the optical fiber 225 and is split by the optical coupler 219 into measurement light, which enters the optical fiber 224, and reference light, which enters the optical fiber 226. The measurement light travels along the optical path 250 for the OCT optical system, illuminates the subject's eye 200 under observation, and, after reflection and scattering by the subject's eye 200, returns along the same optical path to the optical coupler 219 as return light. Meanwhile, the reference light reaches the reference mirror 221 via the optical fiber 226, the lens 223, and the dispersion compensation glass 222, which is inserted to match the chromatic dispersion of the measurement light and the reference light, and is reflected there. The reference light reflected by the reference mirror 221 returns along the same optical path and reaches the optical coupler 219.

The measurement light and the reference light are combined by the optical coupler 219 into interference light. The measurement light and the reference light interfere when their optical path lengths are substantially equal. The reference mirror 221 is held so that it can be adjusted along the optical axis by a motor and drive mechanism (not shown) controlled by the imaging control unit 101-2, which allows the optical path length of the reference light to be matched to that of the measurement light. The interference light is guided to the spectrometer 230 through the optical fiber 227.

Polarization adjustment units 228 and 229 are provided in the optical fibers 224 and 226, respectively, and adjust the polarization of the measurement light and the reference light, respectively. Each of the polarization adjustment units 228 and 229 has several sections in which the optical fiber is routed in a loop. Rotating these looped sections about the longitudinal direction of the optical fiber twists the fiber, so that the polarization states of the measurement light and the reference light can each be adjusted and matched.

The spectrometer 230 is provided with lenses 232 and 234, a diffraction grating 233, and a line sensor 231. The interference light emitted from the optical fiber 227 is collimated by the lens 234, dispersed by the diffraction grating 233, and imaged onto the line sensor 231 by the lens 232.

The line sensor 231 measures the interference light as intensity information for each wavelength. The per-wavelength intensity information measured by the line sensor 231 is output to the image processing apparatus 101 as signal data for a tomographic image (an interference signal). The image processing apparatus 101 can generate a tomographic image of the subject's eye 200 using the received signal data.
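This description does not spell out how the tomographic image is computed from the interference signal. As a generic illustration of spectral-domain reconstruction, the sketch below removes the constant background from one measured spectrum and takes the magnitude of its discrete Fourier transform to obtain a depth profile (A-scan). It assumes the spectrum is already sampled at evenly spaced wavenumbers and omits steps a real system typically needs, such as wavelength-to-wavenumber resampling, windowing, and dispersion correction. The function name `a_scan_from_spectrum` and the simulated single-reflector spectrum are hypothetical.

```python
import cmath
import math


def a_scan_from_spectrum(spectrum):
    """Depth profile from one spectrum sampled evenly in wavenumber."""
    n = len(spectrum)
    background = sum(spectrum) / n
    # Subtract the mean so the constant (DC) term does not dominate.
    fringes = [value - background for value in spectrum]
    profile = []
    # A real-valued spectrum gives a mirror-symmetric transform,
    # so only the first half of the depth bins is kept.
    for depth in range(n // 2):
        coefficient = sum(
            value * cmath.exp(-2j * math.pi * depth * index / n)
            for index, value in enumerate(fringes)
        )
        profile.append(abs(coefficient))
    return profile


# Simulated spectrum for a single reflector: one cosine fringe whose
# frequency corresponds to depth bin 10.
sample_count = 64
reflector_bin = 10
spectrum = [
    1.0 + 0.5 * math.cos(2 * math.pi * reflector_bin * index / sample_count)
    for index in range(sample_count)
]
profile = a_scan_from_spectrum(spectrum)
peak_bin = max(range(len(profile)), key=profile.__getitem__)
```

Repeating this for each lateral scan position and stacking the profiles side by side would give a two-dimensional tomographic image. The direct transform here is for clarity; a fast Fourier transform routine would be used in practice.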

Although a Michelson interferometer is used as the interferometer in this embodiment, a Mach-Zehnder interferometer may be used instead. For example, depending on the difference in light quantity between the measurement light and the reference light, a Mach-Zehnder interferometer can be used when the difference is large, and a Michelson interferometer when the difference is relatively small.

(Image quality enhancement processing)
Image quality enhancement processing for motion contrast data acquired by OCT is described in detail below. First, the terms used in the description of the embodiments are briefly defined. Three-dimensional volume data relating to the interference signal, or to luminance tomographic images based on it, is referred to as OCT data, and three-dimensional volume data relating to motion contrast is referred to as OCTA data. Two-dimensional information that can be extracted from the three-dimensional volume data relating to the interference signal or luminance tomographic images is referred to as an OCT image, and two-dimensional information that can be extracted from the three-dimensional volume data relating to motion contrast is referred to as an OCTA image. In particular, an image generated by projecting or integrating, over a specified range in the depth direction, the three-dimensional volume data relating to the interference signal or luminance tomographic images is referred to as an OCT front image (luminance En-Face). An image generated by projecting or integrating the three-dimensional volume data relating to motion contrast is referred to as an OCTA front image. A two-dimensional image that includes data in the depth direction is referred to as a tomographic image.

(Image quality enhancement unit)
FIG. 3 is a block diagram showing the configuration of the image quality enhancement unit 101-47 in more detail. The image quality enhancement unit 101-47 includes an image quality enhancement processing unit 301, a region detection unit 302, an ROI setting unit 303, a blend processing unit 304, and a BC adjustment unit 305.

The image quality enhancement processing unit 301 performs image quality enhancement processing on the medical image acquired by the acquisition unit 101-1 using a trained model for image quality enhancement (an image quality enhancement engine, or image quality enhancement model), described later, and generates a high-quality image. The image quality enhancement processing unit 301 can function as an example of an image quality enhancement unit that uses a trained model, obtained by training with medical images of subjects as training data, to perform image quality enhancement processing on a medical image of a subject (a first image) and acquire a high-quality image of the subject (a second image). The medical images subject to image quality enhancement may be OCT images, OCTA images, tomographic images, OCT front images, OCTA front images, SLO images, anterior eye observation images, and images such as analysis maps obtained by analyzing these images.

The region detection unit 302 performs region detection processing on a medical image, or on a medical image whose quality has been enhanced, using a trained model for region detection (a region detection engine, or region detection model), described later, and detects at least one region (for example, a target region). For example, the region detection unit 302 can detect the target region and regions other than the target region. The region detection unit 302 can function as an example of a detection unit that detects a target region in a medical image or high-quality image of a subject. For example, the region detection unit 302 uses the region detection engine to detect perfused regions and non-perfused regions in an OCTA front image. In fundus diagnosis, identifying perfused and non-perfused regions makes it possible to judge whether blood flow is absent where blood vessels should be present, or whether some blood flow (such as neovascularization) is observed where no blood vessels should be present. The target region is not limited to perfused and non-perfused regions, and may be any region that includes a region in which the image quality enhancement engine causes overcorrection. For example, the target region may include the foveal avascular zone, the optic disc, and the like in addition to perfused and non-perfused regions.

The classification of regions may be managed with attribute information (labels) indicating each region. For example, the region detection unit 302 may generate a label image that holds the attribute information as the pixel value at each pixel position of the medical image. In addition to the attribute information, a value indicating the likelihood (reliability, probability) of each piece of attribute information output from the region detection engine may be held in association with each pixel of the medical image.
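
As a non-limiting illustration of the label image described above, the following sketch derives a label image and a per-pixel likelihood from per-class probability maps. The function name, the array layout `(num_classes, H, W)`, and the class numbering are assumptions made for this example; the description does not specify the output format of the region detection engine.

```python
import numpy as np

def labels_from_probabilities(prob):
    """Build a label image and a confidence image from class probabilities.

    prob: array of shape (num_classes, H, W); this layout is an assumption.
    Returns (label_image, confidence), each of shape (H, W).
    """
    # Attribute information (label) per pixel: the most likely class index.
    label_image = np.argmax(prob, axis=0).astype(np.uint8)
    # Likelihood kept alongside the label for each pixel.
    confidence = np.max(prob, axis=0)
    return label_image, confidence

# Hypothetical two-class output (0: non-perfusion, 1: perfusion), 2x2 pixels.
prob = np.array([[[0.9, 0.2], [0.6, 0.1]],
                 [[0.1, 0.8], [0.4, 0.9]]])
label_image, confidence = labels_from_probabilities(prob)
```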

The ROI setting unit 303 sets an ROI (Region Of Interest) based on the regions detected by the region detection unit 302. The ROI can be at least one of the regions (target regions) detected by the region detection unit 302. The ROI may also be set automatically by the region detection unit 302 as a region having predetermined attribute information, in which case the ROI setting unit 303 may be omitted.

The blend processing unit 304 performs blend processing, which is a type of image processing, on the set ROI (for example, a region having predetermined attribute information), using the medical image and the high-quality image of that medical image. Any known process, such as alpha blending, may be used as the blend processing. The blend processing unit 304 can function as an example of an image processing unit that processes the target region in a medical image or a high-quality image so that the difference between the pixel values of the target region and the pixel values of regions other than the target region widens and the pixel values of the target region become lower.
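
A minimal sketch of alpha blending restricted to an ROI is given below for illustration. It assumes that the enhanced image is kept unchanged outside the ROI and that a single scalar `alpha` weights the original image inside the ROI; both choices, as well as the function and variable names, are assumptions of this example and are not prescribed by the description.

```python
import numpy as np

def blend_in_roi(original, enhanced, roi_mask, alpha):
    """Alpha-blend the original and enhanced images inside the ROI only.

    Inside the ROI: alpha * original + (1 - alpha) * enhanced.
    Outside the ROI: the enhanced image is kept as is (assumption).
    """
    original = np.asarray(original, dtype=np.float64)
    enhanced = np.asarray(enhanced, dtype=np.float64)
    blended = alpha * original + (1.0 - alpha) * enhanced
    return np.where(roi_mask, blended, enhanced)

# Hypothetical 2x2 images; only the top-left pixel belongs to the ROI.
original = np.full((2, 2), 10.0)
enhanced = np.full((2, 2), 20.0)
roi_mask = np.array([[True, False], [False, False]])
result = blend_in_roi(original, enhanced, roi_mask, alpha=0.5)
```

When the original image is darker than the enhanced image inside the ROI, as in this toy input, the blend lowers the ROI pixel values relative to the surrounding pixels.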

The BC adjustment unit 305 performs BC (Brightness, Contrast) adjustment processing, which is correction processing of at least one of brightness and contrast, on the set ROI (for example, a region having predetermined attribute information). Here, the contrast correction processing may include any known contrast correction processing, for example correction using a tone curve or a gamma curve, or level correction. The BC adjustment unit 305 can also function as an example of an image processing unit that processes the target region in a medical image or a high-quality image so that the difference between the pixel values of the target region and the pixel values of regions other than the target region widens and the pixel values of the target region become lower.
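
For illustration only, the following sketch applies a gamma curve inside an ROI. It assumes pixel values normalized to the range [0, 1], for which a gamma greater than 1 lowers the values; the normalization, the gamma value, and the names used are assumptions of this example.

```python
import numpy as np

def gamma_in_roi(image, roi_mask, gamma=2.0):
    """Apply out = in ** gamma inside the ROI; leave other pixels unchanged.

    image is assumed to be normalized to [0, 1], so gamma > 1 darkens the ROI.
    """
    image = np.asarray(image, dtype=np.float64)
    corrected = np.power(image, gamma)
    return np.where(roi_mask, corrected, image)

# Hypothetical one-row image; only the first pixel belongs to the ROI.
image = np.array([[0.5, 0.5]])
roi_mask = np.array([[True, False]])
adjusted = gamma_in_roi(image, roi_mask, gamma=2.0)
```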

Only one of the blend processing unit 304 and the BC adjustment unit 305 may be provided. When both are provided, the image processing apparatus 101 may be configured so that either or both of the blend processing and the BC adjustment processing can be selected as the image processing to be executed, in accordance with an instruction from the operator. When both the blend processing and the BC adjustment processing are executed, the image processing apparatus 101 may be configured to display the images resulting from each processing on the output unit 103, either switched or side by side, in accordance with an operation on a UI or the like.

(Medical image quality enhancement engine)
The image quality enhancement engine used by the image quality enhancement processing unit 301 will be described below with reference to FIGS. 4 to 7. In this embodiment the image quality enhancement engine is stored in the storage unit 101-3, but it may instead be provided in an external device connected to the image processing apparatus 101. In that case, the image quality enhancement processing unit 301 can perform image quality enhancement processing using the image quality enhancement engine provided in the external device.

The image quality enhancement engine according to this embodiment is a trained model obtained by training (learning) with a machine learning algorithm. In this embodiment, the machine learning model is trained with learning data composed of a group of pairs, each consisting of input data that is a low-quality image having specific imaging conditions expected for the images to be processed, and output data that is a high-quality image corresponding to the input data. The specific imaging conditions include, specifically, a predetermined imaging region, imaging method, imaging angle of view, image size, and the like.

Here, trained models in general are briefly described. A trained model is a machine learning model based on an arbitrary machine learning algorithm that has been trained (learned) in advance using appropriate learning data. The learning data is composed of one or more pairs of input data and output data (correct data). The format and combination of the input data and output data in the pairs may be whatever suits the desired configuration: one may be an image and the other a numerical value, one may be a group of images and the other a character string, or both may be images.

A specific example is learning data composed of a group of pairs of an image acquired by OCT and an imaging region label corresponding to that image (hereinafter, first learning data). The imaging region label is a unique numerical value or character string representing the region. Another example is learning data composed of a group of pairs of a noisy low-quality image acquired by normal OCT imaging and a high-quality image obtained by imaging multiple times with OCT and applying image quality enhancement processing (hereinafter, second learning data).

When input data is input to a trained model, output data according to the design of that trained model is output. For example, the trained model outputs the output data that is most likely to correspond to the input data, following the tendency it was trained on with the learning data. The trained model can also, for example, output as a numerical value the likelihood (reliability, probability) that each type of output data corresponds to the input data, again following the tendency it was trained on.

Specifically, for example, when an image acquired by OCT is input to a machine learning model trained with the first learning data, the model outputs the imaging region label of the region captured in the image, or outputs a probability for each imaging region label. Likewise, when a noisy low-quality image acquired by normal OCT imaging is input to a machine learning model trained with the second learning data, the model outputs a high-quality image equivalent to one obtained by imaging multiple times with OCT and applying image quality enhancement processing. From the viewpoint of maintaining quality, the machine learning model may be configured not to use its own output data as learning data.

Machine learning algorithms include deep learning techniques such as convolutional neural networks (CNN: Convolutional Neural Network). In deep learning techniques, different parameter settings for the layers and nodes that make up the neural network may change the degree to which the tendency trained with the learning data can be reproduced in the output data. For example, in a deep learning model using the first learning data, more appropriate parameter settings may raise the probability of outputting the correct imaging region label. Similarly, in a deep learning model using the second learning data, more appropriate parameter settings may allow a higher-quality image to be output.

Specifically, the parameters of a CNN can include, for example, the filter kernel size, the number of filters, the stride value, and the dilation value set for the convolution layers, as well as the number of nodes output by the fully connected layers. The parameter group and the number of training epochs can be set, based on the learning data, to values that suit the way the trained model is used. For example, based on the learning data, a parameter group and a number of epochs can be set that output the correct imaging region label with higher probability or output a higher-quality image.

One method for determining such a parameter group and number of epochs is as follows. First, 70% of the pairs constituting the learning data are randomly assigned to training and the remaining 30% to evaluation. Next, the machine learning model is trained using the training pairs, and at the end of each training epoch a training evaluation value is calculated using the evaluation pairs. The training evaluation value is, for example, the average of the values obtained by evaluating, with a loss function, the output produced when the input data of each pair is input to the machine learning model under training against the output data corresponding to that input data. Finally, the parameter group and the number of epochs at which the training evaluation value is smallest are adopted as the parameter group and number of epochs of the machine learning model. Dividing the pairs into a training group and an evaluation group in this way before determining the number of epochs can prevent the machine learning model from overfitting to the training pairs.
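
The split and epoch selection just described can be sketched as follows. This is a simplified illustration: the per-epoch evaluation values are supplied as a plain list rather than computed by an actual training loop, and the function names, the fixed random seed, and the sample values are assumptions of this example.

```python
import numpy as np

def split_pairs(num_pairs, train_fraction=0.7, seed=0):
    """Randomly split pair indices into training and evaluation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_pairs)
    n_train = int(round(num_pairs * train_fraction))
    return order[:n_train], order[n_train:]

def best_epoch(eval_values):
    """Return the 1-based epoch whose training evaluation value is smallest."""
    return int(np.argmin(eval_values)) + 1

train_idx, eval_idx = split_pairs(10)
# Hypothetical mean loss on the evaluation pairs after each of five epochs.
eval_values = [0.90, 0.50, 0.40, 0.45, 0.60]
chosen = best_epoch(eval_values)
```

In this toy sequence the evaluation value rises again after the third epoch, which is the pattern that the selection rule uses to avoid overfitting to the training pairs.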

The image quality enhancement engine according to this embodiment is configured as a module that outputs a high-quality image obtained by enhancing the quality of an input low-quality image. In this specification, image quality enhancement means converting an input image into an image whose quality is more suitable for image diagnosis, and a high-quality image is an image that has been converted into one whose quality is more suitable for image diagnosis. A low-quality image is, for example, a two-dimensional or three-dimensional image acquired by X-ray imaging, CT, MRI, OCT, PET, SPECT, or the like, or a three-dimensional moving image from continuously captured CT, that was captured without settings specifically intended to yield high image quality. Specifically, low-quality images include, for example, images acquired by low-dose imaging with an X-ray imaging apparatus or CT, MRI imaging without a contrast agent, or short-duration OCT imaging, and OCTA images acquired with a small number of imaging repetitions.

When a high-quality image with little noise or high contrast is used for image analysis, such as blood vessel analysis of OCTA and similar images or region segmentation of CT, OCT, and similar images, the analysis can often be performed more accurately than with a low-quality image. The high-quality image output by the image quality enhancement engine may therefore be useful not only for image diagnosis but also for image analysis.

What constitutes image quality suitable for image diagnosis depends on what is to be diagnosed in each type of image diagnosis. It therefore cannot be stated unconditionally, but image quality suitable for image diagnosis includes, for example, low noise, high contrast, colors and gradations that make the imaged object easy to observe, a large image size, or high resolution. It can also include quality in which objects or gradations that do not actually exist, but were rendered during image generation, have been removed from the image.

The image processing method constituting the image quality enhancement method of the image quality enhancement processing unit 301 in this embodiment performs processing using various machine learning algorithms such as deep learning. In addition to processing using machine learning algorithms, the image processing method may perform any existing processing, such as various image filter processing, matching processing using a database of high-quality images corresponding to similar images, and knowledge-based image processing.

A configuration example of the CNN used in the image quality enhancement engine according to this embodiment is described below with reference to FIG. 4. FIG. 4 shows an example of the configuration of the image quality enhancement engine. The configuration shown in FIG. 4 is composed of a plurality of layers that process a group of input values and output the result. As shown in FIG. 4, the types of layers included in the configuration are a convolution (Convolution) layer, a downsampling (Downsampling) layer, an upsampling (Upsampling) layer, and a merger (Merger) layer.

The convolution layer performs convolution processing on a group of input values according to set parameters such as the filter kernel size, the number of filters, the stride value, and the dilation value. The number of dimensions of the filter kernel size may be changed according to the number of dimensions of the input image. The downsampling layer makes the number of output values smaller than the number of input values by thinning out or combining the input values; a specific example is Max Pooling. The upsampling layer makes the number of output values larger than the number of input values by duplicating the input values or adding values interpolated from them; a specific example is linear interpolation. The merger layer receives groups of values, such as the output values of a layer or the pixel values constituting an image, from a plurality of sources and combines them by concatenation or addition.
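
The downsampling, upsampling, and merger operations described above can be illustrated with plain array operations. The sketch below uses 2x2 Max Pooling, upsampling by duplication (one of the options named in the text, chosen here instead of linear interpolation for brevity), and merging by addition or concatenation. The function names and the fixed factor of 2 are assumptions of this example, not the configuration of FIG. 4.

```python
import numpy as np

def max_pool_2x2(x):
    """Downsample a 2-D array by taking the maximum of each 2x2 block."""
    h, w = x.shape
    x = x[:h - h % 2, :w - w % 2]  # drop an odd trailing row/column
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample_2x(x):
    """Upsample a 2-D array by duplicating every value into a 2x2 block."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def merge(a, b, mode="add"):
    """Combine two same-shaped arrays by addition or by concatenation."""
    if mode == "add":
        return a + b
    return np.stack([a, b], axis=0)  # concatenate along a new channel axis

x = np.arange(16).reshape(4, 4)
pooled = max_pool_2x2(x)        # fewer values than the input
restored = upsample_2x(pooled)  # back to the input size
added = merge(x, restored, mode="add")
stacked = merge(x, restored, mode="concat")
```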

In this configuration, the group of values output after the pixel values constituting the input image Im410 pass through the convolution processing blocks is combined, in the merger layer, with the pixel values constituting the input image Im410 itself. The combined pixel values are then shaped into the high-quality image Im420 by the final convolution layer. Although not illustrated, the CNN configuration may be modified by, for example, incorporating a batch normalization (Batch Normalization) layer or an activation layer using a rectified linear unit (Rectifier Linear Unit) after a convolution layer.

A GPU can perform computation efficiently by processing larger amounts of data in parallel. It is therefore effective to use a GPU when training is repeated many times with a learning model, as in deep learning. Accordingly, in this embodiment a GPU is used in addition to the CPU for processing by the image processing unit 101-4, which is an example of a learning unit. Specifically, when a learning program including the learning model is executed, the CPU and the GPU cooperate in the computation to perform the learning. The processing of the learning unit may instead be computed by the CPU alone or the GPU alone. The image quality enhancement processing unit 301 may also be implemented using a GPU, in the same way as the learning unit.

The learning unit may also include an error detection unit and an updating unit (not shown). The error detection unit obtains the error between the correct data and the output data that is output from the output layer of the neural network in response to the input data supplied to the input layer. The error detection unit may calculate this error using a loss function. The updating unit updates the connection weighting coefficients between nodes of the neural network, and the like, based on the error obtained by the error detection unit so that the error becomes smaller. The updating unit updates the connection weighting coefficients and the like using, for example, the error backpropagation method, which is a technique for adjusting the connection weighting coefficients between nodes of each neural network, and the like, so that the error becomes smaller.
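
The roles of the error detection unit and the updating unit can be illustrated with a deliberately small example. The sketch below uses a one-weight linear model with a mean squared error loss and plain gradient descent; an actual neural network would propagate the error back through many layers, so this is only a stand-in for the idea. The model, the learning rate, and all names are assumptions of this example.

```python
import numpy as np

def mse_loss(pred, target):
    """Error detection: mean squared error between output and correct data."""
    return float(np.mean((pred - target) ** 2))

def train_step(w, x, target, lr=0.1):
    """One update of the single weight w so that the error becomes smaller."""
    pred = w * x
    loss = mse_loss(pred, target)
    # Gradient of the loss with respect to w (chain rule through pred = w * x).
    grad = float(np.mean(2.0 * (pred - target) * x))
    return w - lr * grad, loss

x = np.array([1.0, 2.0, 3.0])
target = 2.0 * x  # correct data generated with a true weight of 2
w = 0.0
losses = []
for _ in range(5):
    w, loss = train_step(w, x, target)
    losses.append(loss)
```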

When some image processing methods are used, such as image processing with a CNN, attention must be paid to the image size. Specifically, it should be noted that the input low-quality image and the output high-quality image may need to have different image sizes, for example as a countermeasure against the problem that the peripheral part of the high-quality image is not sufficiently enhanced.

For clarity of explanation this is not stated explicitly in this embodiment, but when an image quality enhancement engine is adopted that requires different image sizes for its input image and its output image, the image size is assumed to be adjusted as appropriate. Specifically, the image size of an input image, such as an image used in the learning data for training the machine learning model or an image input to the image quality enhancement engine, is adjusted by padding it or by joining the imaged regions surrounding it. The padded region is filled with a constant pixel value, filled with neighboring pixel values, or mirror-padded, according to the characteristics of the image quality enhancement method, so that the image quality can be enhanced effectively.
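
The three padding options mentioned above can be illustrated with NumPy's `numpy.pad`, whose `constant`, `edge`, and `reflect` modes correspond to filling with a constant value, filling with neighboring pixel values, and mirror padding, respectively. The pad width of one pixel and the tiny input are assumptions chosen only to keep the example readable.

```python
import numpy as np

image = np.array([[1, 2],
                  [3, 4]])

# Fill the border with a constant pixel value (0 here).
constant_padded = np.pad(image, 1, mode="constant", constant_values=0)

# Fill the border with the nearest neighboring pixel value.
edge_padded = np.pad(image, 1, mode="edge")

# Mirror padding: reflect the image about its edge pixels.
mirror_padded = np.pad(image, 1, mode="reflect")
```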

The image quality enhancement method of the image quality enhancement processing unit 301 may be implemented with a single image processing method or with a combination of two or more image processing methods. Alternatively, a plurality of image quality enhancement methods may be run in parallel to generate a plurality of high-quality images, and the image with the highest quality may then be selected as the final high-quality image. The selection of the highest-quality image may be made automatically using an image quality evaluation index, or the plurality of high-quality images may be displayed on a UI provided in the output unit 103 or the like and the selection made according to an instruction from the examiner (operator).

Because an input image that has not been enhanced may in some cases be more suitable for image diagnosis, the input image may be included among the candidates for the final image selection. Parameters may also be input to the image quality enhancement engine together with the low-quality image. For example, a parameter specifying the degree of image quality enhancement, or a parameter specifying the image filter size used in the image processing method, may be input together with the input image.

The input data of the learning data for the image quality enhancement engine according to this embodiment is a low-quality image acquired with the same model as the imaging apparatus 100 and the same settings as the imaging apparatus 100. The output data of the learning data is a high-quality image acquired with the same model using imaging-condition settings or image processing that require more effort. Specifically, the output data can be, for example, a high-quality image (superimposed image) obtained by applying superimposition processing, such as averaging, to a group of images (original images) acquired by imaging multiple times.

High-quality and low-quality images are described here using OCTA motion contrast data as an example. Motion contrast data, used in OCTA and the like, is data obtained by repeatedly imaging the same location of the imaging target and detecting the temporal change in the target between acquisitions. As described above, an OCTA En-Face image (OCTA front image) can be generated by producing a front image from the data in a desired range in the depth direction of the imaging target, out of the calculated motion contrast data (an example of three-dimensional medical image data). In the following, the number of times OCT data is repeatedly acquired at substantially the same position (substantially the same location) is called NOR (Number Of Repeat).

Regarding the learning data according to this embodiment, two different methods are described with reference to FIGS. 5(a) and 5(b) as examples of generating high-quality and low-quality images by superimposition processing. The methods of generating the high-quality and low-quality images are not limited to these, and any known generation method may be used.

The first method for generating high-quality and low-quality images is described with reference to FIG. 5(a). In the first method, a motion contrast image generated from OCT data obtained by repeatedly imaging substantially the same position of the imaging target is used as an example of a high-quality image. In FIG. 5(a), the motion contrast image Im510 is a three-dimensional motion contrast image (three-dimensional motion contrast data). The motion contrast image Im511 is a two-dimensional motion contrast image (two-dimensional motion contrast data) that forms part of the three-dimensional motion contrast image.

The tomographic images Im501-1 to Im501-3 are OCT tomographic images (B-scan images) used to generate the motion contrast image Im511. In FIG. 5(a), NOR corresponds to the number of OCT tomographic images in the tomographic images Im501-1 to Im501-3, so NOR is 3 in the illustrated example. The tomographic images Im501-1 to Im501-3 are captured at a predetermined time interval (Δt). Substantially the same position refers to one line in the front direction (XY) of the eye to be examined, and in FIG. 5(a) it corresponds to the position of the motion contrast image Im511. The front direction is an example of a direction intersecting the depth direction.

Because motion contrast data is data in which temporal change has been detected, NOR must be at least 2 to generate it. For example, when NOR is 2, one set of motion contrast data is generated. When NOR is 3, two sets of motion contrast data are generated if only OCT data from adjacent time intervals (first and second, second and third) is used. If OCT data from a separated time interval (first and third) is also used, a total of three sets of motion contrast data are generated. That is, as NOR is increased to 3, 4, and so on, the number of sets of motion contrast data at substantially the same position also increases. A high-quality motion contrast image can be generated by aligning a plurality of motion contrast images acquired by repeatedly imaging substantially the same position and applying superimposition processing such as averaging. For this reason, NOR can be set to at least 3 to generate a high-quality motion contrast image, and to 5 or more, for example, to obtain a motion contrast image of still higher quality.
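
The counts given above follow from simple pair counting: using only adjacent acquisitions yields NOR - 1 sets of motion contrast data, while using every pair of acquisitions yields NOR(NOR - 1)/2 sets. The sketch below enumerates the pairs for illustration; the function name and the 1-based numbering of the acquisitions are assumptions of this example, and the calculation of the motion contrast value itself is not shown.

```python
from itertools import combinations

def motion_contrast_pairs(nor, adjacent_only=True):
    """List the acquisition pairs from which motion contrast data is computed.

    Acquisitions are numbered 1..nor. With adjacent_only=True only consecutive
    acquisitions are paired; otherwise every pair of acquisitions is used.
    """
    if nor < 2:
        raise ValueError("NOR must be at least 2 to detect temporal change")
    if adjacent_only:
        return [(i, i + 1) for i in range(1, nor)]
    return list(combinations(range(1, nor + 1), 2))

adjacent = motion_contrast_pairs(3)                         # 1st-2nd, 2nd-3rd
all_pairs = motion_contrast_pairs(3, adjacent_only=False)   # adds 1st-3rd
```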

一方、これに対応する低画質画像の例としては、加算平均等の重ね合わせ処理を行う前のモーションコントラスト画像を用いることができる。この場合、低画質画像は、例えば、高画質画像を生成するための加算平均等の重ね合わせ処理を行う際の基準画像とすることができる。重ね合わせ処理を行う際に、基準画像に対して対象画像の位置や形状を変形して位置合わせを行っておけば、基準画像と重ね合わせ処理後の画像とでは空間的な位置ずれがほとんどない。そのため、容易に低画質画像と高画質画像のペアとすることができる。なお、基準画像ではなく位置合わせの画像変形処理を行った対象画像を低画質画像としてもよい。 On the other hand, as an example of a corresponding low image quality image, a motion contrast image before performing superimposition processing such as averaging can be used. In this case, the low-quality image can be used as a reference image for superimposition processing such as averaging for generating a high-quality image. If the position and shape of the target image are deformed with respect to the reference image when performing the superimposition processing, there is almost no spatial positional deviation between the reference image and the image after the superimposition processing. . Therefore, a low quality image and a high quality image can be easily paired. It should be noted that the low-quality image may be a target image that has been subjected to image deformation processing for alignment instead of the reference image.

元画像群（基準画像と対象画像）のそれぞれを入力データ、対応する重ね合わせ画像を出力データとすることで、複数のペア群を生成することができる。例えば、１５の元画像群から１の重ね合わせ画像を得る場合、元画像群のうちの一つ目の元画像と重ね合わせ画像とのペア、元画像群のうちの二つ目の元画像と重ね合わせ画像とのペアを生成することができる。このように、１５の元画像群から１の重ね合わせ画像を得る場合には、元画像群のうちの一つの画像と重ね合わせ画像による１５のペア群が生成可能である。なお、主走査（Ｘ）方向に略同一位置を繰り返し撮影し、それを副走査（Ｙ）方向にずらしながらスキャンをすることで三次元の高画質データを生成することができる。 A plurality of pairs can be generated by using each image of the original image group (the reference image and the target images) as input data and the corresponding superimposed image as output data. For example, when one superimposed image is obtained from a group of 15 original images, a pair of the first original image and the superimposed image, a pair of the second original image and the superimposed image, and so on can be generated. Thus, when one superimposed image is obtained from a group of 15 original images, 15 pairs can be generated, each consisting of one image of the group and the superimposed image. Three-dimensional high-quality data can be generated by repeatedly imaging substantially the same position in the main scanning (X) direction while shifting the scan in the sub-scanning (Y) direction.
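As a minimal sketch of the pairing described above, the following assumes the original images are already aligned and stacked in a NumPy array, and uses a plain arithmetic mean as the superimposition. The name `make_training_pairs` is hypothetical, and the alignment step itself is outside the scope of the sketch.

```python
import numpy as np


def make_training_pairs(originals: np.ndarray):
    """Pair every original image with the superimposed (averaged) image.

    originals: array of shape (n, H, W), assumed to be already aligned.
    Returns a list of n (input, output) tuples sharing the same output.
    """
    superimposed = originals.mean(axis=0)  # simple averaging as the superimposition
    return [(image, superimposed) for image in originals]
```

With 15 original images this yields 15 pairs, each using a different original image as input data and the same superimposed image as output data.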

次に、高画質画像と低画質画像の生成例に係る第２の方法について図５（ｂ）を参照して説明する。当該第２の方法では、撮影対象の略同一領域を複数回撮影したモーションコントラスト画像を重ね合わせ処理することで高画質画像を生成する。なお、略同一領域とは被検眼の正面方向（Ｘ－Ｙ）において、３×３ｍｍや１０×１０ｍｍのような領域のことを示し、撮影対象の略同一領域を複数回撮影することで、断層画像の深さ方向を含めて三次元のモーションコントラスト画像（三次元のモーションコントラストデータ）を取得することができる。同一領域を複数回撮影して重ね合わせ処理を行う際には、１回あたりの撮影を短くするため、ＮＯＲは２回か３回とすることができる。 Next, a second method for generating a high-quality image and a low-quality image will be described with reference to FIG. 5(b). In the second method, a high-quality image is generated by superimposing motion contrast images obtained by imaging substantially the same region of the imaging target a plurality of times. Here, the substantially same region refers to a region such as 3 × 3 mm or 10 × 10 mm in the front direction (X-Y) of the eye to be examined. By imaging substantially the same region a plurality of times, a three-dimensional motion contrast image (three-dimensional motion contrast data) that includes the depth direction of the tomographic images can be acquired. When the same region is imaged a plurality of times and superimposed, NOR can be set to two or three in order to shorten each imaging session.

また、高画質な３次元モーションコントラストデータを生成するために、同一領域の３次元データを少なくとも２つ以上取得する。図５（ｂ）では、複数の三次元モーションコントラスト画像の例を示している。モーションコントラスト画像Ｉｍ５２０～Ｉｍ５４０は、図５（ａ）で説明したモーションコントラスト画像Ｉｍ５１０と同様に、三次元のモーションコントラスト画像である。これら２つ以上の三次元モーションコントラスト画像を用いて、正面方向（Ｘ－Ｙ）と深度方向（Ｚ）の位置合わせ処理を行い、それぞれのデータにおいてアーチファクトとなるデータを除外した後に、平均化処理を行う。これにより、アーチファクトの除外された１つの高画質な三次元モーションコントラスト画像を生成することができる。 Also, in order to generate high-quality three-dimensional motion contrast data, at least two sets of three-dimensional data of the same region are acquired. FIG. 5(b) shows an example of a plurality of three-dimensional motion contrast images. The motion contrast images Im520 to Im540 are three-dimensional motion contrast images, like the motion contrast image Im510 described with reference to FIG. 5(a). These two or more three-dimensional motion contrast images are aligned in the front direction (X-Y) and the depth direction (Z), data that would become artifacts are excluded from each data set, and averaging is then performed. This makes it possible to generate a single high-quality three-dimensional motion contrast image from which artifacts have been excluded.

一方、これに対応する低画質画像は、加算平均等の重ね合わせ処理を行う際の基準データとすることができる。第１の方法で説明したように、基準画像と加算平均後の画像とでは空間的な位置ずれがほとんどないため、容易に低画質画像と高画質画像のペアとすることができる。なお、基準データではなく位置合わせの画像変形処理を行った対象データから生成した任意の三次元モーションコントラスト画像を低画質画像としてもよい。 On the other hand, the corresponding low-quality image can be used as reference data for superimposition processing such as averaging. As described in the first method, since there is almost no spatial positional deviation between the reference image and the image after averaging, it is possible to easily pair the low-quality image and the high-quality image. An arbitrary three-dimensional motion contrast image generated from target data subjected to image deformation processing for alignment may be used as the low image quality image instead of the reference data.

第１の方法では、撮影自体が１回で終了するため被検者の負担は少ない。しかし、ＮＯＲの回数を増やすほど１回の撮影時間が長くなってしまう。また、撮影途中に目の混濁や睫毛などのアーチファクトが入った場合には必ずしも良い画像が得られるとは限らない。第２の方法では、複数回撮影を行うため被検者の負担は少し増えてしまう。しかし、１回の撮影時間が短く済むのと、１回の撮影でアーチファクトが入ったとしても、別の撮影でアーチファクトが写らなければ最終的にはアーチファクトの少ないきれいな画像を得ることができる。これらの特徴を鑑みて、データを集める際には被検者の状況に合わせて任意の方法を選択することができる。 In the first method, since the imaging itself is completed in one time, the burden on the subject is small. However, as the number of times of NOR is increased, the time taken for one shot becomes longer. Also, if artifacts such as cloudy eyes or eyelashes appear during imaging, it is not always possible to obtain a good image. In the second method, imaging is performed a plurality of times, which slightly increases the burden on the subject. However, one shooting time is short, and even if artifacts appear in one shooting, if no artifacts appear in another shooting, a clear image with few artifacts can finally be obtained. In view of these characteristics, any method can be selected according to the condition of the subject when collecting data.

本実施形態では、学習データとして用いる低画質画像と高画質画像としてモーションコントラスト画像を例に説明したが、学習データとして用いる画像はこれに限らない。モーションコントラストデータを生成するためにＯＣＴデータを取得しているため、ＯＣＴデータを用いて同様に低画質画像と高画質画像を生成することが可能である。さらに、本実施形態ではトラッキング処理について説明を省略したが、被検眼の略同一位置や略同一領域を撮影するため、被検眼のトラッキングを行いながら撮影を行うこともできる。トラッキング処理は公知の任意の方法によって行われてよい。 In the present embodiment, a motion contrast image is used as an example of a low image quality image used as learning data and a high image quality image, but the image used as learning data is not limited to this. Since the OCT data is acquired to generate the motion contrast data, it is possible to similarly generate the low quality image and the high quality image using the OCT data. Furthermore, although description of the tracking process is omitted in this embodiment, since substantially the same position or substantially the same area of the eye to be examined is photographed, the photographing can be performed while tracking the eye to be examined. Tracking processing may be performed by any known method.

三次元の高画質データと低画質データのペアを取得できた場合には、これらから任意の二次元画像のペアを生成することができる。例えば、生成した高画質な三次元モーションコントラスト画像について、所望の深度範囲で投影又は積算を行い、任意のＯＣＴＡ正面画像を生成することで、高画質なＯＣＴＡ平面画像を生成することができる。また、これに対応する低画質画像は、加算平均等の重ね合わせ処理を行う際の基準データから生成する任意のＯＣＴＡ正面画像とすることができる。この場合にも、基準画像と加算平均後の画像とでは空間的な位置ずれがほとんどないため、容易に低画質画像と高画質画像のペアとすることができる。なお、基準データではなく位置合わせの画像変形処理を行った対象データから生成した任意のモーションコントラスト正面画像を低画質画像としてもよい。 If a pair of three-dimensional high image quality data and low image quality data can be acquired, an arbitrary pair of two-dimensional images can be generated from these. For example, a high-quality OCTA planar image can be generated by projecting or integrating the generated high-quality three-dimensional motion contrast image in a desired depth range to generate an arbitrary OCTA frontal image. Also, the corresponding low-quality image can be an arbitrary OCTA front image generated from reference data when performing superimposition processing such as averaging. In this case also, since there is almost no spatial positional deviation between the reference image and the image after the averaging, it is possible to easily pair the low-quality image and the high-quality image. An arbitrary motion contrast front image generated from target data subjected to image transformation processing for alignment may be used as the low image quality image instead of the reference data.

このような学習データとして用いる二次元画像のペアの例について、図６（ａ）及び図６（ｂ）を参照してより詳細に説明する。例えば、学習データに用いる画像をＯＣＴＡ正面画像とする場合、上述のように、モーションコントラストに係る三次元ボリュームデータについて所望の深度範囲で投影又は積算を行うことで、ＯＣＴＡ正面画像を生成することができる。ここで、深度範囲とは、図５（ａ）及び図５（ｂ）に示すＺ方向における範囲である。 Examples of pairs of two-dimensional images used as such learning data will be described in more detail with reference to FIGS. 6(a) and 6(b). For example, when the images used for learning data are OCTA front images, an OCTA front image can be generated, as described above, by projecting or integrating the three-dimensional motion contrast volume data over a desired depth range. Here, the depth range is a range in the Z direction shown in FIGS. 5(a) and 5(b).
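The projection or integration over a depth range can be sketched as follows. This is an illustrative simplification: it assumes the volume is stored as a NumPy array ordered (Z, Y, X) and that the depth range is a flat slab given by two hypothetical indices `z_start` and `z_end`, whereas in practice the range would typically follow segmented layer boundaries plus an offset.

```python
import numpy as np


def enface_projection(volume: np.ndarray, z_start: int, z_end: int,
                      mode: str = "mean") -> np.ndarray:
    """Project a (Z, Y, X) volume over the depth range [z_start, z_end).

    mode="mean" or "max" gives a projection; mode="sum" gives an integration.
    """
    slab = volume[z_start:z_end]
    if slab.shape[0] == 0:
        raise ValueError("depth range is empty")
    if mode == "mean":
        return slab.mean(axis=0)
    if mode == "max":
        return slab.max(axis=0)
    if mode == "sum":
        return slab.sum(axis=0)
    raise ValueError(f"unknown mode: {mode}")
```

Applying the same function with the same depth range to the high-quality volume and to the reference volume would give a spatially matched pair of two-dimensional front images.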

図６（ａ）はＯＣＴＡ正面画像の例を示す。学習データに用いるＯＣＴＡ正面画像としては、表層（画像Ｉｍ６１０）、深層（画像Ｉｍ６２０）、外層（画像Ｉｍ６３０）、及び脈絡膜血管網（画像Ｉｍ６４０）など、異なる深度範囲で生成したＯＣＴＡ正面画像を用いることができる。なお、ＯＣＴＡ正面画像の種類はこれに限られず、基準となる層とオフセットの値を変えて異なる深度範囲を設定したＯＣＴＡ正面画像を生成して種類を増やしてもよい。学習を行う際には、異なる深さのＯＣＴＡ正面画像毎に別々に学習をしてもよいし、異なる深度範囲の画像を複数組み合わせて（例えば、表層側と深層側で分けて）学習してもよいし、全ての深度範囲のＯＣＴＡ正面画像を一緒に学習させるようにしてもよい。ＯＣＴデータから生成する輝度のＥｎ－Ｆａｃｅ画像を学習データに用いる場合も、ＯＣＴＡ正面画像と同様に、任意の深度範囲から生成した複数のＥｎ－Ｆａｃｅ画像を用いることができる。 FIG. 6(a) shows examples of OCTA front images. As the OCTA front images used for learning data, OCTA front images generated for different depth ranges can be used, such as the superficial layer (image Im610), the deep layer (image Im620), the outer layer (image Im630), and the choroidal vascular network (image Im640). The types of OCTA front images are not limited to these, and the number of types may be increased by generating OCTA front images with different depth ranges set by changing the reference layer and the offset value. Learning may be performed separately for each OCTA front image of a different depth, may combine a plurality of images of different depth ranges (for example, divided into a superficial side and a deep side), or may use the OCTA front images of all depth ranges together. When luminance En-Face images generated from OCT data are used as learning data, a plurality of En-Face images generated from arbitrary depth ranges can likewise be used, as with the OCTA front images.

例えば、高画質化エンジンが、被検眼の異なる深度範囲に対応する複数のＯＣＴＡ正面画像を含む学習データを用いて得た学習済モデルを含む場合を考える。このとき、取得部１０１－１は、異なる深度範囲を含む長い深度範囲のうち一部の深度範囲に対応するＯＣＴＡ正面画像を第１の画像として取得することができる。すなわち、学習データに含まれる複数のＯＣＴＡ正面画像に対応する複数の深度範囲とは異なる深度範囲に対応するＯＣＴＡ正面画像を、高画質化処理時の入力画像とすることができる。もちろん、学習時と同じ深度範囲のＯＣＴＡ正面画像を、高画質化処理時の入力画像としてもよい。また、一部の深度範囲は、操作者がＵＩ上の任意のボタンを押す等に応じて設定されてもよいし、自動的に設定されてもよい。なお、上述した内容は、ＯＣＴＡ正面画像に限るものではなく、例えば、輝度のＥｎ－Ｆａｃｅ画像に対しても適用することができる。 For example, consider a case where the image quality enhancement engine includes a trained model obtained using training data including a plurality of OCTA frontal images corresponding to different depth ranges of the subject's eye. At this time, the acquisition unit 101-1 can acquire, as the first image, an OCTA front image corresponding to a partial depth range of the long depth range including different depth ranges. That is, an OCTA frontal image corresponding to a depth range different from the plurality of depth ranges corresponding to the plurality of OCTA frontal images included in the learning data can be used as an input image for image quality enhancement processing. Of course, the OCTA front image in the same depth range as that used during learning may be used as the input image during image quality enhancement processing. Also, a part of the depth range may be set according to the operator pressing an arbitrary button on the UI, or may be set automatically. Note that the above-described content is not limited to the OCTA front image, and can also be applied to, for example, a luminance En-Face image.

なお、学習済モデルの処理対象の画像が断層画像である場合、Ｂスキャン画像であるＯＣＴ断層画像やモーションコントラストデータの断層画像を学習データとして用いて学習を行う。これに関して、図６（ｂ）を参照して説明する。図６（ｂ）において、画像Ｉｍ６５１～画像Ｉｍ６５３はＯＣＴ断層画像（輝度の断層画像）である。図６（ｂ）において画像が異なるのは、副走査（Ｙ）方向の位置が異なる場所の断層画像を示しているからである。断層画像においては、副走査方向の位置の違いを気にせずに一緒に学習するようにしてもよい。ただし、撮影部位（例えば、黄斑部中心や視神経乳頭部中心）が異なる場所を撮影した画像の場合には、部位ごとに別々に学習するようにしてもよいし、撮影部位を気にせずに一緒に学習するようにしてもよい。なお、ＯＣＴ断層画像と、モーションコントラストデータの断層画像においては画像特徴量が大きく異なるので別々に学習を行う方がよい。 Note that when the image to be processed by the trained model is a tomographic image, learning is performed using OCT tomographic images, which are B-scan images, or tomographic images of motion contrast data as learning data. This will be described with reference to FIG. 6(b). In FIG. 6(b), images Im651 to Im653 are OCT tomographic images (luminance tomographic images). The images in FIG. 6(b) differ because they show tomographic images at different positions in the sub-scanning (Y) direction. Tomographic images may be learned together without regard to differences in position in the sub-scanning direction. For images of different imaging sites (for example, centered on the macula or centered on the optic nerve head), learning may be performed separately for each site, or the images may be learned together without regard to the imaging site. Because the image features of OCT tomographic images and of tomographic images of motion contrast data differ greatly, it is preferable to train on them separately.

学習データの出力データとして用いられる高画質画像としては、例えば、上述のように重ね合わせ画像を用いることができる。重ね合わせ処理を行った重ね合わせ画像は、元画像群で共通して描出された画素が強調されるため、画像診断に適した高画質画像になる。この場合には、生成される高画質画像は、共通して描出された画素が強調された結果、低輝度領域と高輝度領域との違いがはっきりした高コントラストな画像になる。また、例えば、重ね合わせ画像では、撮影毎に発生するランダムノイズが低減されたり、ある時点の元画像ではうまく描出されなかった領域が他の元画像群によって補間されたりすることができる。 As a high-quality image used as output data for learning data, for example, a superimposed image can be used as described above. A superimposed image that has undergone the superimposing process has a high-quality image that is suitable for image diagnosis because the pixels commonly drawn in the original image group are emphasized. In this case, the generated high-quality image is a high-contrast image in which the difference between the low-brightness region and the high-brightness region is clear as a result of the commonly drawn pixels being emphasized. Also, for example, in the superimposed image, random noise that occurs each time an image is captured can be reduced, and an area that was not well rendered in the original image at a certain point in time can be interpolated with another original image group.

さらに、重ね合わせ画像を学習データの出力データとする場合、重ね合わせ画像から学習データの入力データとして用いる低画質画像を生成することもできる。この場合には、例えば、重ね合わせ画像を一度ダウンサンプリングで低解像化し、低解像度化した画像を既知の方法（ニアレストネイバー法、バイリニア法など）でアップサンプリングを行ったものを学習データの入力データとすることができる。このような画像のペア群を学習データとして用いて学習を行うことで解像感を向上する高画質化エンジンを構成することも可能である。 Furthermore, when a superimposed image is used as the output data of the learning data, a low-quality image to be used as the input data of the learning data can also be generated from the superimposed image. In this case, for example, the superimposed image is first reduced in resolution by downsampling, and the reduced-resolution image is then upsampled by a known method (nearest neighbor, bilinear, etc.); the result can be used as the input data of the learning data. By learning with such pairs of images as learning data, it is also possible to configure an image quality enhancement engine that improves the perceived resolution.
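A minimal sketch of this degradation step is given below. It assumes block averaging for the downsampling (the text does not specify the downsampling method) and nearest-neighbor replication for the upsampling; `degrade_resolution` and `factor` are hypothetical names.

```python
import numpy as np


def degrade_resolution(image: np.ndarray, factor: int = 2) -> np.ndarray:
    """Downsample by block averaging, then upsample by nearest neighbor.

    The result has the same shape as `image` but reduced resolution, so it
    can serve as input data paired with the original superimposed image.
    """
    h, w = image.shape
    if h % factor or w % factor:
        raise ValueError("image size must be divisible by factor")
    # Block-average each factor x factor neighbourhood (assumed downsampling).
    low = image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    # Nearest-neighbor upsampling: repeat every pixel along both axes.
    return np.repeat(np.repeat(low, factor, axis=0), factor, axis=1)
```

The training pair would then be `(degrade_resolution(superimposed), superimposed)`.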

また、機械学習モデルの入力データを複数の画像で構成する必要がある場合には、元画像群から必要な数の元画像群を選択し、入力データとすることができる。例えば、１５枚の元画像群から１枚の重ね合わせ画像を得る場合において、機械学習モデルの入力データとして２枚の画像が必要であれば、１０５（１５Ｃ２＝１０５）のペア群を生成可能である。 When the input data of the machine learning model must consist of a plurality of images, the required number of original images can be selected from the original image group and used as input data. For example, when one superimposed image is obtained from a group of 15 original images and the machine learning model requires two images as input data, 105 pairs (15C2 = 105) can be generated.
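The count of 105 follows from choosing 2 images out of 15 without regard to order. A small sketch, assuming unordered selection and using the hypothetical name `make_multi_input_pairs`:

```python
from itertools import combinations

import numpy as np


def make_multi_input_pairs(originals: np.ndarray, superimposed: np.ndarray, k: int = 2):
    """Build (input, output) pairs whose input stacks k original images.

    originals: array of shape (n, H, W); superimposed: array of shape (H, W).
    Every unordered choice of k originals yields one pair.
    """
    return [(np.stack(combo), superimposed) for combo in combinations(originals, k)]
```

For 15 original images and `k=2`, the returned list has 105 entries, each with an input of shape (2, H, W).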

なお、学習データを構成するペア群のうち、高画質化に寄与しないペアは学習データから取り除くことができる。例えば、学習データのペアを構成する出力データである高画質画像が画像診断に適さない画質である場合には、当該学習データを用いて学習した高画質化エンジンが出力する画像も画像診断に適さない画質になってしまう可能性がある。そのため、出力データが画像診断に適さない画質であるペアを学習データから取り除くことで、高画質化エンジンが画像診断に適さない画質の画像を生成する可能性を低減させることができる。 Note that, among the pairs constituting the learning data, pairs that do not contribute to image quality enhancement can be removed from the learning data. For example, if the high-quality image serving as the output data of a pair has an image quality unsuitable for image diagnosis, the images output by an image quality enhancement engine trained on that learning data may also have an image quality unsuitable for image diagnosis. Therefore, removing pairs whose output data has an image quality unsuitable for image diagnosis from the learning data reduces the possibility that the image quality enhancement engine generates images of unsuitable quality.

また、ペアである画像群の平均輝度や輝度分布が大きく異なる場合には、当該学習データを用いて学習した高画質化エンジンが、低画質画像と大きく異なる輝度分布を持つ画像診断に適さない画像を出力する可能性がある。このため、平均輝度や輝度分布が大きく異なる入力データと出力データのペアを学習データから取り除くこともできる。 In addition, if the average luminance or luminance distribution differs greatly between the images of a pair, an image quality enhancement engine trained on that learning data may output an image that has a luminance distribution greatly different from that of the low-quality image and is therefore unsuitable for image diagnosis. For this reason, pairs of input data and output data whose average luminance or luminance distribution differ greatly can also be removed from the learning data.
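One simple way to apply such a criterion is sketched below. It compares only the mean luminance of each pair; the threshold `max_mean_diff` and the function name are hypothetical, and the source does not specify how "greatly different" is measured.

```python
import numpy as np


def filter_pairs_by_luminance(pairs, max_mean_diff: float = 0.1):
    """Keep only pairs whose input and output mean luminance are close.

    pairs: iterable of (low_quality, high_quality) arrays on the same scale.
    max_mean_diff: illustrative threshold on the absolute mean difference.
    """
    kept = []
    for low, high in pairs:
        if abs(float(np.mean(low)) - float(np.mean(high))) <= max_mean_diff:
            kept.append((low, high))
    return kept
```

A comparison of luminance histograms could be added in the same loop if the distribution, and not only the mean, is to be checked.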

さらに、ペアである画像群に描画される撮影対象の構造や位置が大きく異なる場合には、当該学習データを用いて学習した高画質化エンジンが、低画質画像と大きく異なる構造や位置に撮影対象を描画した画像診断に適さない画像を出力する可能性がある。このため、描画される撮影対象の構造や位置が大きく異なる入力データと出力データのペアを学習データから取り除くこともできる。また、高画質化エンジンについて、品質保持の観点から、自身が出力する高画質画像を学習データとして用いないように構成することができる。 Furthermore, if the structure or position of the imaging target depicted in the images of a pair differs greatly, an image quality enhancement engine trained on that learning data may output an image unsuitable for image diagnosis, in which the imaging target is depicted with a structure or at a position greatly different from that in the low-quality image. For this reason, pairs of input data and output data in which the depicted structure or position of the imaging target differs greatly can also be removed from the learning data. In addition, from the viewpoint of maintaining quality, the image quality enhancement engine can be configured so that the high-quality images it outputs are not used as learning data.

このように学習を行った高画質化エンジンを用いることで、高画質化処理部３０１は、一回の撮影で取得された医用画像が入力された場合に、重ね合わせ処理によって高コントラスト化やノイズ低減等が行われたような高画質画像を出力することができる。このため、高画質化処理部３０１は、入力画像である低画質画像に基づいて、画像診断に適した高画質画像を生成することができる。 By using the image quality enhancement engine trained in this way, when a medical image acquired in a single imaging session is input, the image quality enhancement processing unit 301 can output a high-quality image that looks as if contrast enhancement, noise reduction, and the like had been achieved by superimposition processing. The image quality enhancement processing unit 301 can therefore generate a high-quality image suitable for image diagnosis from the low-quality input image.

なお、ここでは学習データの出力データとして重ね合わせ画像を用いる例について説明したが、高画質化エンジンの学習データの出力データはこれに限られない。学習データの出力データは、入力データに対応する高画質画像であればよく、例えば、診断に適するようにノイズ低減された画像や、コントラスト補正を行った画像、高解像度化した画像、より工数の多い撮影条件で撮影した画像等であってもよい。また、入力データとして用いる低画質画像に最大事後確率推定（ＭＡＰ推定）処理などの統計処理を用いた画像処理を施した画像を、学習データの出力データとして用いることもできる。なお、高画質画像の生成方法は、公知の任意の手法を用いてよい。 Although an example using a superimposed image as the output data of the learning data has been described here, the output data of the learning data for the image quality enhancement engine is not limited to this. The output data may be any high-quality image corresponding to the input data, for example an image with noise reduced to suit diagnosis, a contrast-corrected image, a resolution-enhanced image, or an image captured under more labor-intensive imaging conditions. An image obtained by applying image processing based on statistical processing, such as maximum a posteriori (MAP) estimation, to the low-quality image used as input data can also be used as the output data of the learning data. Any known method may be used to generate the high-quality image.

また、高画質化エンジンとしては、ノイズ低減やコントラスト補正、さらに高解像度化など種々の高画質化処理をそれぞれ単独で行う複数の高画質化エンジンを用意してもよい。また、少なくとも２つの高画質化処理を行うひとつの高画質化エンジンを用意してもよい。なお、これらの場合には、学習データの出力データとしては、所望の処理に応じた高画質化画像を用いればよい。例えば、個々の処理を行う高画質化エンジンに関しては、ノイズ低減処理等の個々の処理を施した高画質画像を学習データの出力データとすればよい。また、複数の高画質化処理を行う高画質化エンジンに関しては、例えば、ノイズ低減処理及びコントラスト補正処理等を施した高画質画像を学習データの出力データとすればよい。 Also, as the image quality improvement engine, a plurality of image quality improvement engines may be prepared that individually perform various image quality improvement processes such as noise reduction, contrast correction, and resolution enhancement. Also, one image quality enhancement engine that performs at least two image quality enhancement processes may be prepared. In these cases, a high-quality image corresponding to desired processing may be used as the output data of the learning data. For example, with respect to an image quality enhancement engine that performs individual processing, a high image quality image that has been subjected to individual processing such as noise reduction processing may be used as output data for learning data. As for the image quality enhancement engine that performs a plurality of image quality enhancement processes, for example, a high quality image subjected to noise reduction processing, contrast correction processing, etc. may be used as output data of learning data.

（医用画像の領域検出エンジン） (Region detection engine for medical images)
次に、領域検出部３０２が用いる医用画像の領域検出エンジンについて、コントラスト正面画像を例にして説明する。本実施形態に係る領域検出部３０２は、コントラスト正面画像において、例えば、血流の有無を確認するため、灌流領域と無灌流領域に分類して領域を検出する。以後、無灌流領域をＮＰＡ（ＮｏｎＰｅｒｆｕｓｉｏｎＡｒｅａ）と表記する。また、本実施形態に係る領域検出部３０２は、例えば、中心窩無血管領域（ＦＡＺ：ＦｏｖｅａｌＡｖａｓｃｕｌａｒＺｏｎｅ）や視神経乳頭部（ＯＮＨ：ＯｐｔｉｃＮｅｒｖｅＨｅａｄ）を領域として検出してもよい。検出した領域の情報としては、当該領域のラベルを画像内の画素に関連付けることができる。このように画像内の全画素にラベルを関連付ける深層学習は、セマンティックセグメンテーションと呼ばれる。 Next, the region detection engine for medical images used by the region detection unit 302 will be described, taking a contrast front image as an example. The region detection unit 302 according to the present embodiment detects regions in the contrast front image by classifying them into perfusion regions and non-perfusion regions, for example in order to confirm the presence or absence of blood flow. Hereinafter, a non-perfusion region is referred to as an NPA (Non Perfusion Area). The region detection unit 302 according to the present embodiment may also detect, for example, the foveal avascular zone (FAZ) or the optic nerve head (ONH) as a region. As information on a detected region, the label of that region can be associated with pixels in the image. Deep learning that associates a label with every pixel in an image in this way is called semantic segmentation.

一般に深層学習において、学習データの出力データ（正解データ）を作成する作業をアノテーションと呼ぶ。領域検出用の学習済モデルに係る学習データに関しては、アノテーションによって、入力データ又は対応する画像の各画素位置に対して分類する領域を示すラベルを与えることで出力データを作成していくことができる。このようにして生成した各領域のラベルを示す情報を画素値として有する画像を領域ラベル画像という。アノテーション作業は、すべてを手動で行ってもよいし、一部を自動で行ってもよい。一定の作業が終わった時点で深層学習を随時行っていき、途中段階の領域検出エンジンによるセグメンテーション結果を参考にしながらラベルを修正することで、アノテーション作業の効率をあげることもできる。なお、領域ラベル画像を作成する場合には、入力データだけでなく他の情報を用いてもよい。例えば、入力データを高画質画像とした学習データの出力データとなる領域ラベル画像を作成する際に、高画質化される前の画像等を用いて、例えば参照して領域ラベル画像を作成してもよい。 In deep learning, the work of creating the output data (ground-truth data) of learning data is generally called annotation. For the learning data of a trained model for region detection, output data can be created by annotation, that is, by giving each pixel position of the input data or of a corresponding image a label indicating the region into which it is classified. An image whose pixel values carry the information indicating the label of each region generated in this way is called a region label image. The annotation work may be performed entirely manually or partly automatically. The efficiency of annotation can also be improved by running deep learning whenever a certain amount of work has been completed and correcting the labels while referring to the segmentation results of the intermediate region detection engine. When creating a region label image, information other than the input data may also be used. For example, when creating a region label image to serve as the output data of learning data whose input data is a high-quality image, the image before image quality enhancement may be used, for example as a reference.

このように所定枚数の医用画像である入力データと、医用画像に対応する領域ラベル画像である出力データのペア群とする学習データを用いて機械学習を行った学習済モデルを領域検出エンジンとして用いることができる。なお、このように学習を行った領域検出エンジンでは、各画素についての領域のラベルについての確からしさ（信頼度、確率）を出力することができる。そのため、領域検出部３０２は、領域検出エンジンから出力されたラベルの確率について、例えば各ラベルの中で、他のラベルよりも高い確率のラベルを領域の検出結果として出力することができる。また、各ラベルのうち閾値よりも高い確率のラベルを検出結果として出力することができる。このとき、閾値よりも高い確率のラベルが複数ある場合には、それらすべてを出力してもよいし、そのうちの他のラベルよりも高い確率のラベルを検出結果として出力してもよい。さらに、領域検出部３０２は、学習済モデルを用いて得た各ラベルの確率から、機械学習モデルを用いて、検出結果を決定してもよい。この場合に用いる機械学習アルゴリズムは、ラベルの確率の取得に用いられた機械学習アルゴリズムとは異なる種類の機械学習アルゴリズムであってもよく、例えば、ニューラルネットワーク、サポートベクターマシン、アダブースト、ベイジアンネットワーク、又はランダムフォレスト等であってよい。 In this way, a trained model that has been machine-learned using learning data that is a pair group of a predetermined number of input data, which is a medical image, and output data, which is a region label image corresponding to the medical image, is used as a region detection engine. be able to. Note that the area detection engine that has learned in this way can output the certainty (reliability, probability) of the area label for each pixel. Therefore, the region detection unit 302 can output a label with a higher probability than other labels as the region detection result, for example, among the labels output from the region detection engine. In addition, it is possible to output a label with a probability higher than the threshold among the labels as a detection result. At this time, if there are multiple labels with probabilities higher than the threshold, all of them may be output, or a label with a higher probability than the other labels may be output as the detection result. Furthermore, the region detection unit 302 may determine the detection result using a machine learning model from the probability of each label obtained using the trained model. 
The machine learning algorithm used in this case may be of a different type from the one used to obtain the label probabilities, for example a neural network, a support vector machine, AdaBoost, a Bayesian network, or a random forest.
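The two selection rules described above (taking the label with the highest probability, or taking every label whose probability exceeds a threshold) can be sketched as follows. The sketch assumes the engine output is a NumPy array of per-pixel label probabilities with shape (C, H, W); the function names and the default threshold are hypothetical.

```python
import numpy as np


def labels_by_argmax(probs: np.ndarray) -> np.ndarray:
    """Return, for each pixel, the index of the label with the highest probability.

    probs: array of shape (C, H, W) holding one probability map per label.
    """
    return probs.argmax(axis=0)


def labels_by_threshold(probs: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Return a boolean (C, H, W) mask of labels whose probability exceeds threshold.

    Several labels may be True for the same pixel; the caller can output all
    of them or reduce them further, for example with labels_by_argmax.
    """
    return probs > threshold
```

A pixel for which no label exceeds the threshold simply has no True entry in the mask, which a caller could treat as "undetected".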

なお、高画質化エンジンと領域検出エンジンとして、共通のモデルを使ってもよい。この場合には、例えば、既知のＵ－Ｎｅｔモデルなどが利用でき、高画質化エンジンと領域検出エンジンを同じにすることで、パラメータの差し替えだけでそれぞれの推論処理（推定処理）を行うことができる。 Note that a common model may be used for the image quality enhancement engine and the region detection engine. In this case, for example, a known U-Net model can be used; by giving both engines the same architecture, each inference (estimation) process can be performed simply by swapping the parameters.

また、高画質化エンジンや領域検出エンジンに利用可能な既知のＵ－Ｎｅｔモデルなどは、所定の画像データのサイズで学習を行うため、学習済モデルに入力する画像データは学習時と同じサイズで推論（推定）を行う必要がある。入力画像が学習済モデルに入力する画像サイズより大きい場合、高画質化処理部３０１は、入力画像を複数のサブセット領域に分割して（図１７）、それぞれの領域に対して推論処理を実行して合成する。また、推論領域の一部を有効領域として設定する場合には、非有効領域をマージン領域として設定し、サブセット領域はマージン領域をオーバーラップするように設定してもよい。なお、マージン領域を設定するときは、入力画像をマージン分だけあらかじめ領域を拡大しておけばよい（不図示）。拡大方法はミラーリングなどの一般的な処理が利用できる。 Known U-Net models and the like that can be used for the image quality enhancement engine and the region detection engine are trained with a predetermined image data size, so inference (estimation) must be performed on image data of the same size as that used during training. When the input image is larger than the image size accepted by the trained model, the image quality enhancement processing unit 301 divides the input image into a plurality of subset regions (FIG. 17), performs inference on each region, and combines the results. When only part of the inference region is treated as the valid region, the non-valid region may be set as a margin region, and the subset regions may be set so that their margin regions overlap. When a margin region is set, the input image may be enlarged in advance by the width of the margin (not shown). A common technique such as mirroring can be used for the enlargement.
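The division into overlapping subset regions with a margin can be sketched as below. This is an illustrative reading of the paragraph, not the patented implementation: `model` is a hypothetical callable mapping a (tile, tile) array to an array of the same shape, the image is a single-channel 2D array, and the enlargement uses NumPy's `reflect` padding as one form of mirroring.

```python
import numpy as np


def tiled_inference(image: np.ndarray, model, tile: int = 224, margin: int = 12) -> np.ndarray:
    """Run `model` on overlapping tiles and keep only each tile's valid centre.

    Each tile is tile x tile pixels; the outer `margin` pixels of every result
    are discarded, so neighbouring tiles overlap by their margin regions.
    """
    valid = tile - 2 * margin
    if valid <= 0:
        raise ValueError("margin is too large for the tile size")
    h, w = image.shape
    ny, nx = -(-h // valid), -(-w // valid)          # ceil division: tiles per axis
    pad_h, pad_w = ny * valid - h, nx * valid - w    # extra rows/cols to fill the grid
    # Enlarge the image in advance by mirroring (margin plus grid padding).
    padded = np.pad(image, ((margin, margin + pad_h), (margin, margin + pad_w)),
                    mode="reflect")
    out = np.empty((ny * valid, nx * valid), dtype=float)
    for iy in range(ny):
        for ix in range(nx):
            y0, x0 = iy * valid, ix * valid
            result = model(padded[y0:y0 + tile, x0:x0 + tile])
            out[y0:y0 + valid, x0:x0 + valid] = result[margin:margin + valid,
                                                       margin:margin + valid]
    return out[:h, :w]  # crop back to the original image size
```

With an identity function as `model`, the output reproduces the input exactly, which is a convenient check that the tiling and cropping are consistent.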

On the other hand, depending on the type of network, the subset size can be changed from the size used during training. A network that includes downsampling and upsampling layers is subject to certain constraints, but if the size of the subset to be inferred is enlarged within those constraints, fewer inference passes are needed and processing becomes faster. In this case, the image quality enhancement processing unit 301 divides the input image into subset regions of a size different from that used during training and executes the inference processing. Conversely, setting the image size small during training has the effect of increasing the number of training data items.

Also, consider a case where the trained model is obtained by training on learning data that includes input data that is a medical image having a first image size. At inference time, a first image, which is a medical image having a second image size larger than the first image size, is input to the trained model as input data. The image processing unit can thereby output a second image having the second image size as the output data of the trained model. In this way, medical images of sizes suited to training and to inference (estimation) can be used as input data for the trained model. Specifically, because the input data in the learning data is smaller than the input data at inference time, the number of learning data items can be increased. Meanwhile, because the input data at inference time is larger than the input data in the learning data, the inference processing can be accelerated. Of course, a large discrepancy in subset region size between training and inference may produce undesirable results, so the difference in inference results caused by changing the subset region size should be checked in advance to confirm that it is within an acceptable range. For example, for a model with an input size of 224×224 pixels and a 12-pixel margin on each of the top, bottom, left, and right sides, inferring a 232×232-pixel input image requires four subset regions so that only the effective region is used. By contrast, if the input size of the model is enlarged to 256×256 at inference time only, a single subset region suffices and the inference completes in one pass.

In general, an architecture called batch processing that makes inference more efficient is available, so inference may be made more efficient by setting the number of subset regions according to the batch size. In this case, the input data of the trained model at inference time may be a plurality of medical images each having the second image size, obtained by dividing a medical image having a third image size larger than the second image size into a plurality of images. A plurality of images are then output as the output data of the trained model, and the output images may be combined to generate an image of the third image size.
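The subset-count arithmetic in the example above can be checked with a short sketch. It assumes that the effective region of a subset is the input size minus the margin on both sides, and that subsets are placed on a grid whose stride equals the effective size; the function names are illustrative and do not come from the patent.

```python
import math


def effective_size(input_size: int, margin: int) -> int:
    """Side length of the part of a subset whose output is kept (margins discarded)."""
    return input_size - 2 * margin


def subsets_needed(image_size: int, input_size: int, margin: int) -> int:
    """Number of square subset regions needed to cover a square image."""
    per_axis = math.ceil(image_size / effective_size(input_size, margin))
    return per_axis * per_axis


# 224x224 input, 12-pixel margins: effective region is 200x200.
print(subsets_needed(232, 224, 12))  # 4
# Input enlarged to 256x256 at inference time: effective region is 232x232.
print(subsets_needed(232, 256, 12))  # 1
```

Under these assumptions the two cases reproduce the four-subset and one-subset counts given in the text.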

Furthermore, as input data for region detection, the high-quality image (superimposed image) described above may be used, or a high-quality image obtained by applying the image quality enhancement engine to an input medical image may be used. By training according to the input data in this way, a region detection engine suited to each kind of input data can be configured.

Although the known UNet model has been shown as an example of the region detection engine, region detection may instead be performed by DETR (Detection Transformer), which uses an encoder-decoder Transformer. FIG. 18 shows an example of the Self-Attention used in a Transformer. As shown in FIG. 18(a), Query, Key, and Value are created from the output of a convolution of the input image, using, for example, a one-by-one convolution. As shown in FIG. 18(b), focusing on the upper-left pixel, the inner product of Query and Key is computed and a Softmax is taken to determine which pixels should characterize the upper-left pixel. The Value entries of all pixels are then summed, weighted according to this Softmax result. This series of operations is performed for every pixel, and the result is added to the input and output. In other words, Self-Attention is a mechanism that applies attention to the input itself and reflects the result back onto it; with Self-Attention, each location can be characterized by attending to the features of other locations. Performing Self-Attention per pixel in this way makes it possible to characterize, for example, FAZ, NPA, and vessel pixels well even when they lie far apart in the image or their positions vary widely. In a conventional CNN, capturing the features of distant pixels requires measures such as deepening the hierarchy, enlarging the filter size, or widening the reference range with dilated convolution, all of which increase computational cost. By using Self-Attention, a Transformer-based model can be built for object recognition and region detection in image processing while also keeping computational cost down. Because a Transformer learns while assigning a unique value to the position of each pixel or pixel region, it also learns positional dependencies as a result. Dynamically changing the connections of a neural network according to the input data, through Attention that dynamically changes where to focus, is a highly versatile method. That is, it may be applied to estimating the non-perfusion region from a relationship map between the features of pixels or pixel regions, and the image quality enhancement processing described above may then be applied. Region detection may also be performed by combining DETR and the UNet model. That is, semantic segmentation by the UNet model may be performed after DETR has narrowed down the non-perfusion region. Conversely, after the semantic segmentation by the UNet model, detailed labeling may additionally be performed by DETR.
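The per-pixel Self-Attention described above can be sketched in plain Python. This is a minimal single-head illustration, not the patent's network: the projection matrices `wq`, `wk`, and `wv` are hypothetical stand-ins for the learned one-by-one convolutions (which act as a per-pixel linear map), the feature map is assumed to be flattened into a list of per-pixel vectors, and the scaling factor commonly applied to the Query-Key scores is omitted because the text does not mention it.

```python
import math


def matvec(w, x):
    """Apply a weight matrix to one pixel's feature vector (a 1x1 convolution)."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features, wq, wk, wv):
    """features: list of per-pixel feature vectors (an H*W feature map, flattened)."""
    q = [matvec(wq, f) for f in features]
    k = [matvec(wk, f) for f in features]
    v = [matvec(wv, f) for f in features]
    out = []
    for i, f in enumerate(features):
        # Inner product of this pixel's Query with every pixel's Key, then Softmax.
        weights = softmax([sum(a * b for a, b in zip(q[i], kj)) for kj in k])
        # Weighted sum of all Values.
        attended = [sum(w * vj[c] for w, vj in zip(weights, v)) for c in range(len(f))]
        # Add the result to the input (residual connection).
        out.append([fc + ac for fc, ac in zip(f, attended)])
    return out
```

Because every pixel is compared with every other pixel, distant pixels influence each other in a single step, which is the property the text contrasts with deeper or dilated CNNs. For the residual addition to work, `wv` must map to the same dimension as the input features.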

Note that the region detection engine need not be an artificial intelligence engine based on machine learning; it may be configured using a rule-based algorithm based on the structure of the eye to be examined, including various conventional filtering processes. Accordingly, the region detection engine used by the region detection unit 302 may, for example, extract the FAZ or NPA region by combining filter processing such as a known Gaussian filter, or may extract a region according to an operator's instruction via a GUI or the like. The region detection engine may also detect regions in, for example, a motion contrast image by classifying it into perfused and non-perfused regions through threshold processing that uses the structure of the eye to be examined and a predetermined threshold.
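As a rough illustration of such a rule-based engine, the sketch below smooths an image with a 3×3 Gaussian-like kernel and then thresholds it. The kernel, the edge handling (clamping to the border), the threshold, and the label convention (1 for non-perfused, 0 for perfused) are all assumptions chosen for the example, not values from the patent.

```python
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # 3x3 Gaussian approximation; weights sum to 16


def smooth(img):
    """Smooth a 2D list of pixel values, clamping coordinates at the image border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += KERNEL[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc / 16.0
    return out


def classify_perfusion(img, threshold):
    """Label smoothed pixels below the threshold as non-perfused (1), others as perfused (0)."""
    return [[1 if v < threshold else 0 for v in row] for row in smooth(img)]
```

Smoothing before thresholding keeps isolated noisy pixels from being labeled as their own regions; a practical engine would likely add structural rules on top of this.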

(Image processing procedure for medical images)
Next, a series of image processing procedures performed by the image processing apparatus 101 according to this embodiment will be described with reference to FIG. 7. FIG. 7 is a flowchart showing a schematic flow of the image processing according to this embodiment.

本実施形態に係る画像処理が開始されると、ステップＳ７０１において、取得部１０１－１が医用画像を取得する。取得部１０１－１は、撮影装置１００や外部装置から医用画像を取得してもよいし、これらから取得した信号データを用いて生成された医用画像を取得してもよい。本実施形態では、取得部１０１－１は、医用画像として例えばコントラスト正面画像を取得する。なお、医用画像はこれに限られず、断層画像や輝度のＥｎ－Ｆａｃｅ画像、ＳＬＯ画像、被検体の医用画像を解析して得た解析画像（解析マップ）等であってもよい。 When the image processing according to this embodiment is started, the acquisition unit 101-1 acquires a medical image in step S701. The acquisition unit 101-1 may acquire a medical image from the imaging device 100 or an external device, or may acquire a medical image generated using signal data acquired from these devices. In this embodiment, the acquisition unit 101-1 acquires, for example, a contrast front image as a medical image. The medical image is not limited to this, and may be a tomographic image, a brightness En-Face image, an SLO image, an analysis image (analysis map) obtained by analyzing a medical image of a subject, or the like.

In step S702, the image quality enhancement processing unit 301 applies image quality enhancement processing, which is the first image processing, to the acquired medical image using at least one of the image quality enhancement engines described above, and obtains a high-quality medical image. The image quality enhancement processing unit 301 according to this embodiment uses the image quality enhancement engine to obtain a high-quality OCTA frontal image from an OCTA frontal image.

Next, in step S703, the region detection unit 302 applies region detection processing, using the region detection engine described above, to the medical image acquired in step S701 or the high-quality medical image acquired in step S702. Through this region detection processing, the region detection unit 302 detects at least two regions in the medical image or the high-quality medical image and acquires information indicating each region. For example, the region detection unit 302 may acquire a region label image in which each pixel holds a label indicating the region it belongs to, or may add a label indicating the region to the information of each pixel of the medical image. Note that the region detection unit 302 only needs to be able to output information indicating the detected regions, and that information is not limited to the formats described above. For example, the information output by the region detection unit 302 may be label information associated with the pixel information of the medical image. Further, as described above, a value indicating the likelihood (reliability, probability) of each attribute output from the region detection engine may be associated with each pixel of the medical image and added to the pixel information. In this embodiment, the region detection engine is used to detect the perfused region and the non-perfusion area (NPA) in the OCTA frontal image or the high-quality OCTA frontal image. Note that the pixel positions of the regions detected in the medical image or the high-quality medical image can be treated as corresponding between the medical image and the high-quality medical image.
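One way to picture the region label image and the per-pixel likelihood is the sketch below. It assumes the region detection engine outputs one probability map per attribute and that every map has the same shape; the function name, the label names, and the data layout are illustrative and do not appear in the patent.

```python
def to_label_image(prob_maps):
    """Turn per-attribute probability maps into a label image and a confidence map.

    prob_maps maps a label name (for example "perfused" or "NPA") to a 2D list
    of probabilities; all maps are assumed to share the same shape.
    """
    names = list(prob_maps)
    first = prob_maps[names[0]]
    labels, confidence = [], []
    for y in range(len(first)):
        label_row, conf_row = [], []
        for x in range(len(first[0])):
            # Pick the attribute with the highest probability at this pixel.
            best = max(names, key=lambda n: prob_maps[n][y][x])
            label_row.append(best)
            conf_row.append(prob_maps[best][y][x])
        labels.append(label_row)
        confidence.append(conf_row)
    return labels, confidence


labels, confidence = to_label_image({
    "perfused": [[0.9, 0.2]],
    "NPA": [[0.1, 0.8]],
})
print(labels)      # [['perfused', 'NPA']]
print(confidence)  # [[0.9, 0.8]]
```

Keeping the confidence map alongside the labels makes it available to later steps that vary their strength per pixel.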

領域検出部３０２が検出した領域を示す情報を出力すると、ＲＯＩ設定部３０３が、当該出力された情報に基づいて、高画質化された医用画像におけるＲＯＩを設定する。なお、ＲＯＩは、領域検出部３０２によって検出された領域のうちの少なくとも１つの領域について設定されてよい。本実施形態においては、ＲＯＩ設定部３０３は、高画質化されたＯＣＴＡ正面画像におけるＮＰＡについてＲＯＩを設定する。なお、上述のように、ＲＯＩの設定は領域検出部３０２によって行われてもよい。 When the region detection unit 302 outputs information indicating the detected region, the ROI setting unit 303 sets the ROI in the high-quality medical image based on the output information. Note that the ROI may be set for at least one of the areas detected by the area detection unit 302 . In this embodiment, the ROI setting unit 303 sets the ROI for the NPA in the OCTA frontal image with improved image quality. Note that the ROI may be set by the region detection unit 302 as described above.

Finally, in step S704, the blend processing unit 304 or the BC adjustment unit 305 applies second image processing to the ROI in the high-quality medical image so that the difference between the pixel values of the ROI and the pixel values of the region outside the ROI widens. At this time, the blend processing unit 304 or the BC adjustment unit 305 applies the second image processing so that the pixel values of the ROI become lower than they were before the second image processing. For example, as the second image processing, the blend processing unit 304 blends the medical image acquired in step S701 with the high-quality medical image acquired in step S702 within the ROI of the high-quality medical image. The blending may be performed using any known method, and the blend ratio may be a predetermined ratio or may be set according to an operator's instruction. Through this blending, for a region where overcorrection has been caused by the image quality enhancement processing using the trained model, the image before overcorrection and the overcorrected image are blended, and the difference between the pixel values of the overcorrected region and those of the other regions widens. In other words, the blending lowers the pixel values of the overcorrected region. Such processing lowers pixel values only in the overcorrected region, without lowering them across the entire image. Overcorrection in regions such as the NPA and FAZ can therefore be suppressed.

The BC adjustment unit 305 performs BC adjustment processing as the second image processing on the ROI in the high-quality medical image. The BC adjustment processing may be performed using any known method; for example, the brightness in the ROI may be set to a predetermined value or increased or decreased by a predetermined amount. Contrast in the ROI may also be adjusted using, for example, a tone curve or a gamma curve. Furthermore, the BC adjustment processing may set at least one of the brightness and contrast in the ROI to a value specified by the operator, or may increase or decrease it by that value. For example, the BC adjustment unit 305 may determine a brightness correction value, or setting values such as the tone curve used for contrast adjustment, according to an operator's instruction. Through this BC adjustment processing, for a region where overcorrection has been caused by the image quality enhancement processing using the trained model, at least one of brightness and contrast is adjusted so that the difference between the pixel values of the overcorrected region and those of the other regions widens. In other words, the BC adjustment processing lowers the pixel values of the overcorrected region. Such processing lowers pixel values only in the overcorrected region, without lowering them across the entire image. Overcorrection in regions such as the NPA and FAZ can therefore be suppressed.

In this embodiment, the blend processing unit 304 blends the OCTA frontal image and the high-quality OCTA frontal image for the NPA set as the ROI in the high-quality OCTA frontal image. The BC adjustment unit 305 performs BC adjustment processing on the high-quality OCTA frontal image for the NPA set as the ROI in the high-quality OCTA frontal image. As described above, only one of the processing by the blend processing unit 304 and the processing by the BC adjustment unit 305 may be performed, or both may be performed. The high-quality image subjected to the second image processing by the blend processing unit 304 or the BC adjustment unit 305 may be displayed on the output unit 103 by the display control unit 101-5, output to an external device or the like by the output unit 103, or stored in the storage unit 101-3 or the external storage device 102.

By performing the image processing according to this embodiment in this way, a region in which overcorrection occurs due to image processing of a medical image using a trained model can be detected, and image processing that reduces the overcorrection can be applied to the detected region. For example, the NPA in an OCTA frontal image can be detected, and blend processing or BC adjustment processing that reduces overcorrection can be applied to the NPA in the high-quality OCTA frontal image acquired using the trained model.
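The overall flow of steps S701 to S704 can be summarized as a small pipeline sketch. The three callables are hypothetical stand-ins for the image quality enhancement engine, the region detection engine, and the second image processing; the toy implementations in the usage part exist only to make the example runnable and say nothing about the real engines.

```python
def run_pipeline(image, enhance, detect, second_process):
    """Run the steps in order on an already acquired image (S701)."""
    enhanced = enhance(image)      # S702: first image processing (quality enhancement)
    roi_mask = detect(image)       # S703: region detection; could also take `enhanced`
    return second_process(image, enhanced, roi_mask)  # S704: second image processing


# Toy stand-ins, for illustration only.
def toy_enhance(img):
    return [[v + 5 for v in row] for row in img]


def toy_detect(img):
    return [[v < 50 for v in row] for row in img]


def toy_second_process(original, enhanced, mask):
    # Inside the ROI keep the original value, elsewhere keep the enhanced value.
    return [
        [o if m else e for o, e, m in zip(o_row, e_row, m_row)]
        for o_row, e_row, m_row in zip(original, enhanced, mask)
    ]


print(run_pipeline([[10, 80]], toy_enhance, toy_detect, toy_second_process))  # [[10, 85]]
```

Passing the steps as callables mirrors the point made in the text that region detection may run on either image and that the second image processing may be blending, BC adjustment, or both.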

前述のように、眼底画像を診断する上でＮＰＡは特に重要である。ＮＰＡを特定することで、血管があるべきところに血流がない、あるいは血管がないはずのところに何らかの血流が認められるか（新生血管など）を判断することができる。特にＦＡＺやＮＰＡに関しては、高画質化前の元の状態をある程度残すことで、画像診断において、ノイズか血管かの区別に関する操作者（検者）の判断を支援することができる。 As mentioned above, NPA is of particular importance in diagnosing fundus images. By identifying NPA, it is possible to determine whether there is no blood flow where there should be blood vessels, or whether there is some blood flow where there should be no blood vessels (such as neovascularization). In particular, regarding FAZ and NPA, by leaving the original state before image quality enhancement to some extent, it is possible to assist the operator (examiner)'s judgment regarding discrimination between noise and blood vessels in image diagnosis.

一方で、ＮＰＡに対して、明るさやコントラストを所望の状態にしてノイズなどを積極的に除去することで、よりＮＰＡらしくすることで視認性を向上させることもできる。この場合にも、画像診断において、ノイズか血管かの区別に関する操作者の判断を支援することができる。 On the other hand, by setting the brightness and contrast to a desired state and actively removing noise and the like from the NPA, it is possible to make it more NPA-like and improve the visibility. In this case as well, in image diagnosis, it is possible to assist the operator's judgment regarding the distinction between noise and blood vessels.

なお、一般的には、他の検査も含めて複合的に診断が行われるため、第２の画像処理方法は予め設定してもよいし、適宜選択できるようにしてもよい。また、分類する領域によって第２の画像処理方法を変更できるようにしてもよい。さらに、被検眼によって設定を変更することも可能である。また、ＯＣＴＡ正面画像の深さ、すなわち浅層と深層で設定を変更できるようにしてもよい。これらの設定は、ユーザー又は被検者ごとに設定を記憶しておいてもよい。 In addition, since diagnosis is generally performed in combination with other examinations, the second image processing method may be set in advance or may be selected as appropriate. Also, the second image processing method may be changed depending on the area to be classified. Furthermore, it is also possible to change the setting depending on the eye to be examined. Also, the depth of the OCTA front image, that is, the setting may be changed between the shallow layer and the deep layer. These settings may be stored for each user or subject.

As described above, the image processing apparatus 101 according to this embodiment includes the image quality enhancement processing unit 301, the region detection unit 302, the blend processing unit 304, and the BC adjustment unit 305. The image quality enhancement processing unit 301 uses an image quality enhancement model (image quality enhancement engine), obtained by training with medical images of a subject as learning data, to perform image quality enhancement processing on a first image, which is a medical image of the subject, and acquires a second image, which is a high-quality medical image of the subject. The region detection unit 302 detects a target region in the medical image of the subject (the first image or the second image). At least one of the blend processing unit 304 and the BC adjustment unit 305 performs image processing on the target region in the second image so that the difference between the pixel values of the target region and those of the regions outside it widens, and so that the pixel values of the target region become lower than they were before the image processing.

As the image processing, the blend processing unit 304 performs blend processing that blends the first image and the second image so that the difference between the pixel values of the target region and those of the regions outside it widens and so that the pixel values of the target region become lower. As the image processing, the BC adjustment unit 305 performs BC adjustment processing that corrects at least one of brightness and contrast so that the difference between the pixel values of the target region and those of the regions outside it widens and so that the pixel values of the target region become lower. The medical image of the subject can be, for example, a motion contrast image of the eye to be examined, and the target region can include, for example, at least one of a non-perfusion region, a foveal vessel region, and an optic disc region. The region detection unit 302 can also detect the target region using a trained model obtained by training with medical images of a subject as learning data.

With this configuration, the image processing apparatus according to this embodiment can detect a region such as the NPA in which overcorrection occurs due to image processing of a medical image using a trained model, and can apply image processing that reduces the overcorrection to the detected region. Such processing lowers pixel values only in the overcorrected region, without lowering them across the entire image. Overcorrection due to image processing of medical images using a trained model can therefore be reduced. This makes it possible to assist the operator's judgment in distinguishing noise from blood vessels in image diagnosis.

ここで、本実施形態に係る一連の画像処理に関しては、種々の変形が可能である。以下、図８乃至図１３を参照して、本実施形態に係る一連の画像処理の具体的な例について詳細に説明する。なお、各例について、図７に示す前述した画像処理と同様の処理に関しては説明を省略する。 Various modifications are possible for the series of image processing according to the present embodiment. A specific example of a series of image processing according to this embodiment will be described in detail below with reference to FIGS. 8 to 13 . For each example, the description of the same processing as the above-described image processing shown in FIG. 7 will be omitted.

<First example of image processing method>
A first example of the image processing method will be described with reference to FIG. 8. This example describes an image processing method for suppressing overcorrection of the FAZ and NPA by the image quality enhancement engine.

ステップＳ８０１及びステップＳ８０２における処理は、ステップＳ７０１及びステップＳ７０２と同様であるため、説明を省略する。ステップＳ８０２において高画質化された医用画像が取得されると処理はステップＳ８０３に移行する。 Since the processes in steps S801 and S802 are the same as those in steps S701 and S702, description thereof is omitted. In step S802, when the high-quality medical image is obtained, the process proceeds to step S803.

ステップＳ８０３では、領域検出部３０２が、領域検出エンジンを用いて、ステップＳ８０１で取得した医用画像（入力画像）におけるＦＡＺ及びＮＰＡの領域を検出する。ＦＡＺ及びＮＰＡはそれぞれ区別して検出してもよいし、１つの領域としてまとめて検出してもよい。また、いずれか一方の領域だけを検出してもよいし、ＦＡＺ及びＮＰＡ以外の領域は灌流領域として検出してもよい。領域検出部３０２が検出した領域を示す情報を出力したら、ＲＯＩ設定部３０３が、当該出力された情報に基づいて、ステップＳ８０２において高画質化された医用画像におけるＲＯＩを設定する。なお、医用画像における検出された領域の画素位置は、医用画像及び高画質化された医用画像において対応しているものとすることができる。 In step S803, the area detection unit 302 uses the area detection engine to detect the FAZ and NPA areas in the medical image (input image) acquired in step S801. FAZ and NPA may be detected separately, or may be detected collectively as one region. Alternatively, only one of the regions may be detected, or regions other than the FAZ and NPA may be detected as perfusion regions. After the region detection unit 302 outputs the information indicating the detected region, the ROI setting unit 303 sets the ROI in the high-quality medical image in step S802 based on the output information. Note that the pixel positions of the detected regions in the medical image can correspond to each other in the medical image and the high-quality medical image.

Finally, in step S804, the blend processing unit 304 uses the input image and the high-quality medical image acquired in step S802 to perform blend processing on the detected region in the high-quality medical image. A known method such as alpha blending may be used for the blend processing. The blend ratio may be a fixed value, or the blend processing unit 304 may change the blend ratio at each pixel according to the reliability (likelihood, probability) of the attribute information output by the region detection engine. Specifically, for pixels whose attribute information indicates the non-perfusion region, the alpha blend ratio can be set so that the higher the reliability, the higher the proportion of the input image.
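A reliability-dependent alpha blend of this kind might look like the following sketch. It assumes the reliability map is zero outside the detected region, and the linear mapping from reliability to alpha, together with the `max_alpha` cap, is an illustrative choice; the patent does not specify a formula.

```python
def blend_roi(original, enhanced, confidence, max_alpha=0.5):
    """Alpha-blend the input image into the enhanced image, pixel by pixel.

    confidence holds the NPA reliability per pixel (0.0 outside the detected
    region), so a higher reliability gives the input image a larger share.
    """
    out = []
    for orig_row, enh_row, conf_row in zip(original, enhanced, confidence):
        row = []
        for o, e, c in zip(orig_row, enh_row, conf_row):
            alpha = max_alpha * c  # share of the input image at this pixel
            row.append(alpha * o + (1.0 - alpha) * e)
        out.append(row)
    return out


original = [[10.0, 10.0]]
enhanced = [[50.0, 50.0]]
confidence = [[1.0, 0.0]]  # first pixel is confidently NPA, second is outside the region
print(blend_roi(original, enhanced, confidence))  # [[30.0, 50.0]]
```

Pixels with zero reliability keep the enhanced value unchanged, so the blend affects only the detected region.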

また、ＦＡＺ及びＮＰＡ等の属性情報を区別して検出している場合には、区別されている属性情報に応じてそれぞれのブレンド比率を変えてもよい。さらに、ＦＡＺ及びＮＰＡ以外の領域、例えば、ＦＡＺやＮＰＡ等の対象領域の周辺の画素に対しても、入力画像を適度に合成してもよい。これによって、領域境界での急峻な変化を緩和することができる。 In addition, when attribute information such as FAZ and NPA is detected separately, the respective blend ratios may be changed according to the distinguished attribute information. Furthermore, the input image may be appropriately combined with pixels around the target area such as FAZ and NPA, for example, areas other than FAZ and NPA. This makes it possible to moderate sharp changes at the boundary of the area.

なお、ブレンド比率に関しては、ＵＩを介して操作者が指示できるようにしてもよい。例えば、スライドバーなどのＧＵＩを用いて過補正の抑制強度に対応するブレンド比率を設定できるようにしてもよい。 Note that the blend ratio may be instructed by the operator via the UI. For example, a GUI such as a slide bar may be used to set the blend ratio corresponding to the suppression intensity of overcorrection.

本例によれば、ＦＡＺやＮＰＡに関しては、高画質化前の元の状態をある程度残すことできる。そのため、画像診断において、ノイズか血管かの区別に関する操作者の判断を支援することができる。 According to this example, with respect to FAZ and NPA, the original state before image quality improvement can be left to some extent. Therefore, in image diagnosis, it is possible to assist the operator's judgment regarding discrimination between noise and blood vessels.

なお、入力画像である医用画像（第１の画像）における対象領域を検出する場合には、領域検出部３０２は、領域検出エンジンとして、学習済モデルにより構成された領域検出エンジンを用いることができる。この場合、領域検出エンジンの学習データとしては、入力画像である医用画像を入力データとし、入力画像である医用画像における各領域のラベルを示す情報を画素値として有する領域ラベル画像を出力データとすることができる。また、領域検出部３０２は、被検体の構造に基づくルールベースのアルゴリズム等を用いて構成された領域検出エンジンを用いることもできる。この場合にも、領域検出部３０２は、学習済モデルを用いた高画質化が行われていない、言い換えると、過補正が生じていない医用画像における特徴部から、ＦＡＺ及びＮＰＡ等の対象領域を検出することができる。 When detecting a target region in a medical image (first image) that is an input image, the region detection unit 302 can use a region detection engine configured by a trained model as the region detection engine. . In this case, as learning data for the region detection engine, a medical image, which is an input image, is used as input data, and a region label image having, as a pixel value, information indicating the label of each region in the medical image, which is an input image, is used as output data. be able to. The region detection unit 302 can also use a region detection engine configured using a rule-based algorithm or the like based on the structure of the subject. In this case as well, the region detection unit 302 detects target regions such as FAZ and NPA from the characteristic portions of the medical image in which image quality has not been improved using the trained model, in other words, in which overcorrection has not occurred. can be detected.

<Second example of image processing method>
A second example of the image processing method will be described with reference to FIG. 9. This example describes an image processing method for emphasizing the image quality enhancement processing applied to the NPA.

領域検出エンジンの性能が高い場合には、領域検出部３０２は、ＦＡＺやＮＰＡ等の領域をより適切には検出することができる。例えば、ＦＡＺ及びＮＰＡでは、本来血管がないのが正しい。そのため、領域検出エンジンの性能が高い場合には、これらの領域に対しては、高画質化エンジンで処理された後に、さらにＢＣ調整でより暗くすることでより診断に適した画像が得られる。このような画像処理を用いてより診断に適した画像を得ることができる本例の処理を以下でより詳細に説明する。 If the performance of the area detection engine is high, the area detection unit 302 can more appropriately detect areas such as FAZ and NPA. For example, FAZ and NPA are naturally devoid of blood vessels. Therefore, if the performance of the region detection engine is high, these regions are processed by the image quality enhancement engine and then further darkened by BC adjustment to obtain an image more suitable for diagnosis. The processing of this example, which can obtain images more suitable for diagnosis using such image processing, will be described in more detail below.

ステップＳ９０１及びステップＳ９０２における処理は、ステップＳ７０１及びステップＳ７０２と同様であるため、説明を省略する。ステップＳ９０２において高画質化された医用画像が取得されると処理はステップＳ９０３に移行する。 Since the processes in steps S901 and S902 are the same as those in steps S701 and S702, description thereof is omitted. When the high-quality medical image is obtained in step S902, the process proceeds to step S903.

ステップＳ９０３では、領域検出部３０２が、領域検出エンジンを用いて、ステップＳ９０１で取得した医用画像（入力画像）におけるＦＡＺ及びＮＰＡの領域を検出する。ＦＡＺ及びＮＰＡはそれぞれ区別して検出してもよいし、１つの領域としてまとめても検出してもよい。また、いずれか一方の領域だけを検出してもよいし、ＦＡＺ及びＮＰＡ以外の領域を灌流領域として検出してもよい。領域検出部３０２が検出した領域を示す情報を出力したら、ＲＯＩ設定部３０３が、当該出力された情報に基づいて、ステップＳ９０２において高画質化された医用画像におけるＲＯＩを設定する。なお、医用画像における検出された領域の画素位置は、医用画像及び高画質化された医用画像において対応しているものとすることができる。 In step S903, the area detection unit 302 uses the area detection engine to detect the FAZ and NPA areas in the medical image (input image) acquired in step S901. FAZ and NPA may be detected separately, or may be collected and detected as one region. Alternatively, only one of the regions may be detected, or regions other than the FAZ and NPA may be detected as perfusion regions. After the region detection unit 302 outputs the information indicating the detected region, the ROI setting unit 303 sets the ROI in the high-quality medical image in step S902 based on the output information. Note that the pixel positions of the detected regions in the medical image can correspond to each other in the medical image and the high-quality medical image.

Finally, in step S904, the BC adjustment unit 305 performs BC adjustment processing on the pixels of the detected regions (the FAZ and NPA regions) in the high-quality medical image acquired using the image quality enhancement engine in step S902. The BC adjustment may correct the luminance value (brightness) of each pixel value so that it becomes darker. For example, the luminance value may simply be lowered, or it may be set to zero. Image processing that depends on the original pixel value, such as gamma correction, may also be applied. The strength of the correction may differ between the FAZ and the NPA, and if an attribute and a reliability are held as the information indicating the detected region, the BC adjustment may be performed according to the reliability.
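A gamma-based darkening restricted to the detected region could be sketched as follows. The sketch assumes pixel values normalized to the range 0 to 1, where an exponent greater than 1 darkens the value; the mapping from reliability to the exponent and the `max_gamma` parameter are illustrative assumptions, not values from the patent.

```python
def darken_roi(image, roi_mask, confidence, max_gamma=2.0):
    """Darken pixels inside the ROI with a gamma curve scaled by reliability.

    image holds values normalized to [0, 1]. Pixels outside the ROI are
    returned unchanged.
    """
    out = []
    for img_row, mask_row, conf_row in zip(image, roi_mask, confidence):
        row = []
        for v, in_roi, c in zip(img_row, mask_row, conf_row):
            if in_roi:
                gamma = 1.0 + (max_gamma - 1.0) * c  # reliability 0 leaves the pixel as is
                row.append(v ** gamma)
            else:
                row.append(v)
        out.append(row)
    return out


image = [[0.5, 0.5, 0.5]]
roi_mask = [[True, True, False]]
confidence = [[1.0, 0.0, 1.0]]
print(darken_roi(image, roi_mask, confidence))  # [[0.25, 0.5, 0.5]]
```

Simply lowering the value or setting it to zero, as the text also allows, would replace the `v ** gamma` line with a subtraction or a constant.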

The method and strength of the BC adjustment may be made specifiable by the operator via the UI. For example, the method may be selected with a GUI element such as radio buttons, and the strength may be specified with a GUI element such as a slide bar.

According to this example, the visibility of regions such as the NPA and FAZ can also be improved. In this case as well, the physician can be assisted in judging whether a structure is noise or a blood vessel during image diagnosis. Furthermore, the suppression processing shown in the first example and the enhancement processing shown in the second example may be made selectable for each region attribute. For example, enhancement processing may be applied to the FAZ and suppression processing to the NPA. In this case, an image better suited to diagnosis can be generated for each region, which assists the physician's judgment.

<Third Example of Image Processing Method>
A third example of the image processing method will be described with reference to FIG. 10. This example describes image processing in which regions such as the FAZ and NPA are detected, using the region detection engine, from a medical image whose quality has been enhanced with the image quality enhancement engine, and overcorrection in those regions is suppressed.

The processing in steps S1001 and S1002 is the same as in steps S701 and S702, so its description is omitted. Once the quality-enhanced medical image is obtained in step S1002, the process proceeds to step S1003.

In step S1003, the region detection unit 302 uses the region detection engine to detect the FAZ and NPA regions in the medical image whose quality was enhanced in step S1002. The FAZ and NPA may be detected as separate regions or together as a single region. Alternatively, only one of them may be detected, or the regions other than the FAZ and NPA may be detected as perfusion regions. Once the region detection unit 302 outputs information indicating the detected regions, the ROI setting unit 303 sets, based on that information, the ROI in the medical image whose quality was enhanced in step S1002.

Finally, in step S1004, the blend processing unit 304 uses the quality-enhanced medical image and the medical image acquired in step S1001 to perform blend processing on the detected regions in the quality-enhanced medical image. The blend processing may be the same as that described in the first example.
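As a rough illustration of region-limited blending, the sketch below mixes the original image back into the quality-enhanced image only inside the detected region. The blend processing of the first example is defined elsewhere in this document; the simple alpha blend, the function name `blend_in_region`, and the `alpha` parameter are assumptions made for this sketch.

```python
import numpy as np

def blend_in_region(enhanced, original, region_mask, alpha=0.5):
    """Blend `original` into `enhanced` inside `region_mask` only.

    alpha: weight of the original (pre-enhancement) image inside the region;
        0 keeps the enhanced pixels, 1 restores the original pixels.
    """
    enhanced = np.asarray(enhanced, dtype=np.float64)
    original = np.asarray(original, dtype=np.float64)
    blended = alpha * original + (1.0 - alpha) * enhanced
    # Outside the detected region the enhanced image is kept unchanged.
    return np.where(region_mask, blended, enhanced)
```

With a larger `alpha`, more of the pre-enhancement state of the FAZ or NPA is retained.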

In this example too, as in the first example, the original state before image quality enhancement can be retained to some extent in the FAZ and NPA. This assists the operator in judging whether a structure is noise or a blood vessel during image diagnosis.

When a target region is detected in a quality-enhanced medical image, the region detection unit 302 can use a region detection engine composed of a trained model. In this case, the training data for the region detection engine can use the quality-enhanced medical image (second image) as input data and, as output data, a region label image whose pixel values indicate the label of each region in the medical image that was the input image (first image). Here, the quality-enhanced medical image serving as input data is a medical image enhanced with the image quality enhancement engine, and can be a high-quality image in which overcorrection has occurred in a target region such as the NPA. The region label image serving as output data, by contrast, may be an image obtained by labeling the medical image before it was enhanced with the image quality enhancement engine. In this case, the region detection engine composed of the trained model can, following the tendency of its training, output from an input high-quality image a region label image that includes the labels of the target regions in which overcorrection has occurred. Using high-quality images as input is also expected to let the region detection engine extract image features more appropriately and output more accurate region label images.

<Fourth Example of Image Processing Method>
A fourth example of the image processing method will be described with reference to FIG. 11. This example describes image processing in which regions such as the FAZ and NPA are detected, using the region detection engine, from a medical image whose quality has been enhanced with the image quality enhancement engine, and the image quality enhancement is emphasized in those regions.

The processing in steps S1101 and S1102 is the same as in steps S701 and S702, so its description is omitted. Once the quality-enhanced medical image is obtained in step S1102, the process proceeds to step S1103.

In step S1103, the region detection unit 302 uses the region detection engine to detect the FAZ and NPA regions in the medical image whose quality was enhanced in step S1102. The FAZ and NPA may be detected as separate regions or together as a single region. Alternatively, only one of them may be detected, or the regions other than the FAZ and NPA may be detected as perfusion regions. Once the region detection unit 302 outputs information indicating the detected regions, the ROI setting unit 303 sets, based on that information, the ROI in the medical image whose quality was enhanced in step S1102.

Finally, in step S1104, the BC adjustment unit 305 performs BC adjustment processing on the pixels of the detected regions (the FAZ and NPA regions) in the quality-enhanced medical image. The BC adjustment processing may be the same as that described in the second example.

In this example too, as in the second example, the visibility of regions such as the NPA and FAZ can be improved. In this case as well, the physician can be assisted in judging whether a structure is noise or a blood vessel during image diagnosis. Furthermore, the suppression processing shown in the first and third examples and the enhancement processing shown in the fourth example may be made selectable for each region attribute. For example, enhancement processing may be applied to the FAZ and suppression processing to the NPA. In this case, an image better suited to diagnosis can be generated for each region, which assists the physician's judgment.
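Selecting suppression or enhancement per region attribute could look like the sketch below. It assumes a region label image with hypothetical integer labels (`PERFUSION`, `FAZ`, `NPA`), treats "suppress" as a simple alpha blend with the original image and "enhance" as a luminance reduction, and uses illustrative parameter names; none of these specifics come from the disclosure.

```python
import numpy as np

# Hypothetical label values for the region label image.
PERFUSION, FAZ, NPA = 0, 1, 2

def process_by_attribute(enhanced, original, labels, policy, alpha=0.5, darken=0.5):
    """Apply a per-attribute action to a quality-enhanced image.

    policy: dict mapping a label value to "suppress" (blend the original
        image back in) or "enhance" (darken by BC adjustment).
    """
    out = np.array(enhanced, dtype=np.float64)  # working copy
    original = np.asarray(original, dtype=np.float64)
    labels = np.asarray(labels)
    for label, action in policy.items():
        mask = labels == label
        if action == "suppress":
            out[mask] = alpha * original[mask] + (1.0 - alpha) * out[mask]
        elif action == "enhance":
            out[mask] = out[mask] * (1.0 - darken)
        else:
            raise ValueError(f"unknown action: {action}")
    return out
```

For the combination mentioned above, the policy would be `{FAZ: "enhance", NPA: "suppress"}`; labels absent from the policy are left as enhanced.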

<Fifth Example of Image Processing Method>
A fifth example of the image processing method will be described with reference to FIG. 12. This example describes image processing in which regions such as the FAZ and NPA are detected, using the region detection engine, in a medical image before image quality enhancement, BC adjustment is performed on those regions, and image quality enhancement processing using the image quality enhancement engine is then applied.

The processing in step S1201 is the same as in step S701, so its description is omitted. In step S1202, the region detection unit 302 uses the region detection engine to detect the FAZ and NPA regions in the medical image (input image) acquired in step S1201. Once the region detection unit 302 outputs information indicating the detected regions, the ROI setting unit 303 sets the ROI in the input image based on that information.

Next, in step S1203, the BC adjustment unit 305 performs BC adjustment processing on the pixels of the detected regions (the FAZ and NPA regions) in the input image. The BC adjustment processing may be the same as that described in the second example.

Finally, in step S1204, the image quality enhancement processing unit 301 uses the image quality enhancement engine to perform image quality enhancement processing on the medical image that underwent BC adjustment processing in step S1203, and acquires a quality-enhanced medical image. Blend processing or BC adjustment processing based on the detected regions may additionally be applied to the medical image after the image quality enhancement processing by the image quality enhancement processing unit 301. The blend processing and the BC adjustment processing may be the same as those described in the first and second examples, respectively.

As described above, in this example, the region detection unit 302 detects a target region in a first image, which is a medical image of a subject. The BC adjustment unit 305 performs BC adjustment processing on the target region in the first image so that the difference between the pixel values of the target region and those of the regions other than the target region widens, and so that the pixel values of the target region become lower than they were before the BC adjustment processing. The image quality enhancement processing unit 301 then uses an image quality enhancement model obtained by training on medical images of a subject to perform image quality enhancement processing on the BC-adjusted first image, and acquires a high-quality second image, which is a medical image of the subject.
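The two conditions placed on the BC adjustment (a wider difference between the target region and the rest of the image, and lower pixel values inside the target region) can be expressed as a simple check. The sketch below is only one possible reading: the text does not say how the "difference" is measured, so comparing mean pixel values is an assumption, as is the function name `satisfies_bc_conditions`.

```python
import numpy as np

def satisfies_bc_conditions(before, after, region_mask):
    """Return True if `after` meets both BC adjustment conditions.

    Assumes the inside/outside difference is measured on mean pixel values
    and that `region_mask` is True inside the target region.
    """
    before = np.asarray(before, dtype=np.float64)
    after = np.asarray(after, dtype=np.float64)
    inside = np.asarray(region_mask, dtype=bool)
    outside = ~inside

    gap_before = abs(before[outside].mean() - before[inside].mean())
    gap_after = abs(after[outside].mean() - after[inside].mean())

    widened = gap_after > gap_before                      # contrast to the surroundings grows
    lowered = np.all(after[inside] < before[inside])      # every target pixel gets darker
    return bool(widened and lowered)
```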

In this example, image quality enhancement processing is performed on a medical image in which the BC adjustment processing has lowered the pixel values of the regions where overcorrection occurs. With this processing, the pixel values can be lowered only in the regions where overcorrection occurs, without lowering the pixel values of the entire image. As a result, overcorrection in regions such as the NPA and FAZ can be suppressed, and the visibility of those regions can also be improved. In this case as well, the physician can be assisted in judging whether a structure is noise or a blood vessel during image diagnosis.

<Sixth Example of Image Processing Method>
A sixth example of the image processing method will be described with reference to FIG. 13. In this example, region detection is performed with the region detection engine on a medical image whose quality has been enhanced with the image quality enhancement engine. BC adjustment based on the regions found by that detection is then performed on the medical image before image quality enhancement, and image quality enhancement processing using the image quality enhancement engine is applied to the result.

The processing in steps S1301 and S1302 is the same as in steps S701 and S702, so its description is omitted. Once the quality-enhanced medical image is obtained in step S1302, the process proceeds to step S1303.

In step S1303, the region detection unit 302 uses the region detection engine to perform region detection processing on the image whose quality was enhanced in step S1302. As in the first to fifth examples, the regions to be detected may be the FAZ, the NPA, and the like. Applying the region detection engine to a quality-enhanced medical image in this way is expected to improve detection performance. Once the region detection unit 302 outputs information indicating the detected regions, the ROI setting unit 303 sets the ROI in the input image based on that information. Note that the pixel positions of the regions detected in the quality-enhanced medical image can be treated as corresponding between the medical image and the quality-enhanced medical image.

Next, in step S1304, the BC adjustment unit 305 performs BC adjustment processing, based on the regions detected in step S1303, on the medical image (input image) acquired in step S1301. The BC adjustment processing may be the same as that described in the second example.

Finally, in step S1305, the image quality enhancement processing unit 301 uses the image quality enhancement engine to perform image quality enhancement processing on the medical image that underwent BC adjustment processing in step S1304, and acquires a quality-enhanced medical image. Blend processing or BC adjustment based on the region classification may additionally be applied to the image after the image quality enhancement processing performed by the image quality enhancement processing unit 301 in step S1305. The blend processing and the BC adjustment may be the same as those described in the first and second examples, respectively.

As described above, in this example, the image quality enhancement processing unit 301 uses an image quality enhancement model obtained by training on medical images of a subject to acquire, from a first image that is a medical image of the subject, a quality-enhanced version of the first image. The region detection unit 302 detects a target region in the quality-enhanced version of the first image. The BC adjustment unit 305 performs BC adjustment processing on the target region in the first image so that the difference between the pixel values of the target region and those of the regions other than the target region widens, and so that the pixel values of the target region become lower than they were before the BC adjustment processing. The image quality enhancement processing unit 301 then uses the image quality enhancement model to perform image quality enhancement processing on the BC-adjusted first image, and acquires a high-quality second image, which is a medical image of the subject. Note that the region detection unit 302 detects the target region using the quality-enhanced version of the first image on which BC adjustment by the BC adjustment unit 305 has not been performed.
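The order of operations in the sixth example can be summarized with a small sketch. The three callables (`enhance`, `detect_regions`, `bc_adjust`) are hypothetical stand-ins for the image quality enhancement engine, the region detection engine, and the BC adjustment unit; only the data flow between them is taken from the description above.

```python
def sixth_example_pipeline(first_image, enhance, detect_regions, bc_adjust):
    """Run the two-pass flow of the sixth example with caller-supplied steps."""
    # S1302: provisional enhancement of the unadjusted input image.
    provisional = enhance(first_image)
    # S1303: regions are detected on the provisional high-quality image.
    regions = detect_regions(provisional)
    # S1304: BC adjustment is applied to the ORIGINAL input image,
    # using the regions found on the provisional image.
    adjusted = bc_adjust(first_image, regions)
    # S1305: the BC-adjusted image is enhanced to give the second image.
    return enhance(adjusted)
```

The provisional image is used only for detection; the final output is produced from the BC-adjusted input.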

As described above, the image processing using the image quality enhancement engine and the region detection engine can be modified in various ways. The examples described admit many variations and are not intended to limit the interpretation of the present disclosure. For example, when medical images are used for follow-up observation, the blend processing and BC adjustment processing based on the detected regions may be performed under the same conditions, or these conditions may be stored and managed for each subject.

(Second Embodiment)
An image processing system including an image processing apparatus according to a second embodiment of the present invention will be described in detail with reference to FIG. 14. FIG. 14 is a flowchart showing the flow of a series of image processing operations according to this embodiment. The configuration of the image processing system according to this embodiment is the same as that of the image processing system according to the first embodiment, so the same reference numerals are used and its description is omitted. However, the image quality enhancement unit according to this embodiment need not include the region detection unit 302, the ROI setting unit 303, the blend processing unit 304, or the BC adjustment unit 305. The image processing system according to this embodiment is described below, focusing on the differences from the image processing system according to the first embodiment.

The first embodiment described methods for obtaining a more suitable high-quality image by combining an image quality enhancement engine based on deep learning with a region detection engine and performing additional image correction on parts of the image. In contrast, this embodiment describes a configuration that performs image quality enhancement processing using an image quality enhancement engine constructed by applying those methods.

The machine learning algorithm of the image quality enhancement engine according to this embodiment may be the same as that of the image quality enhancement engine according to the first embodiment. However, the output data of the training data for the image quality enhancement engine according to this embodiment are the high-quality images finally generated by any of the methods described in the first embodiment. These processing flows are applied to a predetermined number of medical images to output high-quality images, and each input and its output are paired to build the training data set. An image quality enhancement engine trained on such data can be expected, following the tendency of its training, to generate high-quality images in which overcorrection is suppressed. In addition, because the partially corrected results are learned together in a single model, the image processing load at inference time is expected to be reduced.
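Building the training data set described above amounts to pairing each medical image with the output of one of the first-embodiment processing flows. A minimal sketch, assuming a hypothetical callable `reference_flow` that stands in for whichever flow is chosen (for example, enhancement followed by region-limited blending):

```python
def build_training_pairs(medical_images, reference_flow):
    """Pair each input image with the corrected high-quality image.

    medical_images: iterable of input images (training inputs).
    reference_flow: hypothetical callable implementing one complete
        first-embodiment flow; its result is the training target.
    """
    return [(image, reference_flow(image)) for image in medical_images]
```

A model trained on such pairs would need only a single forward pass at inference time, which is where the expected reduction in processing load comes from.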

The processing procedure of the image processing apparatus 101 according to this embodiment will be described with reference to FIG. 14. First, in step S1401, the acquisition unit 101-1 acquires a medical image. The processing in step S1401 may be the same as that in step S701.

Next, in step S1402, the image quality enhancement processing unit 301 uses the image quality enhancement engine obtained by training on the training data described above to perform image quality enhancement processing on the medical image acquired in step S1401, and acquires a high-quality image. Following the training tendency of the image quality enhancement engine, the high-quality image acquired in this way is one in which overcorrection is suppressed.

As described above, the image processing apparatus according to this embodiment includes the acquisition unit 101-1 and the image quality enhancement processing unit 301. The acquisition unit 101-1 acquires a first image, which is a medical image of a subject. The image quality enhancement processing unit 301 performs image quality enhancement processing on the first image using a second trained model, and thereby acquires a second image, which is a medical image of the subject. The second trained model is obtained by training on medical images of a subject that have undergone both image quality enhancement processing using a first trained model (itself obtained by training on medical images of a subject) and, in a target region of the medical image, blend processing or correction of at least one of brightness and contrast. In this case as well, a high-quality image in which overcorrection is suppressed can be generated.

Performing various kinds of image analysis on a medical image whose quality has been enhanced by any of the methods according to the first and second embodiments allows more accurate analysis. The quality-enhanced image can also be used as the input image for disease screening by an artificial intelligence engine, where more accurate processing can likewise be expected.

[Modification 1]
In the first embodiment, when the region detection unit 302 detects a target region using a trained model, a plurality of frontal images corresponding to a plurality of depth ranges can be used as training data. For example, a plurality of OCTA frontal images corresponding to a plurality of depth ranges can be used as the input data of the training data for the region detection engine that uses the trained model. As the output data of the training data, region label images corresponding to the OCTA frontal images used as input data may be used. By using a common trained model obtained by training on such data, the region detection unit 302 can detect a two-dimensional target region from a frontal image of the subject corresponding to an arbitrary depth range, for example a depth range selected in response to an instruction from the examiner. The depth ranges may include, for example, the superficial layer, the deep layer, the outer layer, and the choroidal vascular network, as well as different depth ranges obtained by changing the reference layer and the offset value. The frontal image is not limited to an OCTA frontal image and may be an En-Face image.

Alternatively, a plurality of trained models corresponding to the respective depth ranges may be prepared using training data for each depth range. In this case, from among the plurality of trained models, each obtained by training on frontal images of a subject for one of the depth ranges, the region detection unit 302 may select the trained model corresponding to the depth range selected in response to an instruction from the examiner, and use the selected trained model to detect a two-dimensional target region from the frontal image of the subject. The region detection unit 302 may also select, from among the plurality of trained models, the trained model corresponding to the depth range of the medical image used for the region detection processing, and use the selected trained model to detect a two-dimensional target region from the frontal image of the subject.

Furthermore, frontal images of different depth ranges may be combined into groups for use as training data (for example, the frontal images may be divided into a superficial-side group and a deep-side group). In this case too, the region detection unit 302 can select the trained model corresponding to the depth range selected in response to an instruction from the examiner, or to the depth range of the medical image used for the region detection processing, and use the selected trained model to detect a two-dimensional target region from the frontal image of the subject.
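The model selection described in this modification can be sketched as a lookup keyed by depth range. The class name `RegionModelSelector`, the string keys, and the fallback to a common model when no depth-specific model exists are assumptions made for illustration; the text presents the common model and the per-depth models as alternatives rather than as a fallback chain.

```python
class RegionModelSelector:
    """Hypothetical lookup of region detection models by depth range."""

    def __init__(self, models_by_depth, common_model=None):
        self.models_by_depth = dict(models_by_depth)
        self.common_model = common_model

    def select(self, image_depth_range, examiner_choice=None):
        # An explicit choice by the examiner takes priority over the depth
        # range recorded for the image being processed.
        depth = examiner_choice if examiner_choice is not None else image_depth_range
        if depth in self.models_by_depth:
            return self.models_by_depth[depth]
        # Assumed fallback: use the common model trained on all depth ranges.
        if self.common_model is not None:
            return self.common_model
        raise KeyError(f"no model for depth range: {depth}")
```

Grouped depth ranges (for example, a superficial-side group) could be handled by mapping several keys to the same model object.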

A three-dimensional image may also be used as the input data of the training data for the region detection engine that uses a trained model. For example, a three-dimensional tomographic image or a three-dimensional motion contrast image can be used as the input data of the training data, and a three-dimensional region label image obtained by labeling that three-dimensional image can be used as the output data. In this case, the region detection unit 302 can detect a three-dimensional target region from a three-dimensional image of the subject using a trained model obtained by training on three-dimensional images of a subject.

Similarly, the image quality enhancement engine may be trained using training data in which a three-dimensional image is the input data and a quality-enhanced three-dimensional image is the output data. In this case, the image quality enhancement processing unit 301 can acquire a quality-enhanced three-dimensional image from a three-dimensional medical image of the subject, using an image quality enhancement engine obtained by training on three-dimensional medical images of a subject. The blend processing unit 304 and the BC adjustment unit 305 can likewise be configured to perform their processing on a three-dimensional target region in a three-dimensional medical image. Processing three-dimensional medical images in this way provides the same effects as the first embodiment. Three-dimensional images may similarly be used as training data for the image quality enhancement engine according to the second embodiment, in which case processing three-dimensional medical images provides the same effects as the second embodiment.

[Modification 2]
The trained models used for image quality enhancement processing and region detection processing (the trained model for image quality enhancement and the trained model for region detection) may be tuned for each subject by additional training, producing trained models dedicated to that subject. For example, medical images acquired in the subject's past examinations can be used for transfer learning of a general-purpose trained model for generating high-quality medical images, or of a general-purpose trained model for detecting regions, to generate a trained model dedicated to that subject. By storing the subject-specific trained model in the storage unit 101-3 or in an external device such as a server in association with the subject's ID, the image processing apparatus 101 can identify and use the subject-specific trained model, based on the subject's ID, when performing the subject's current examination. Using a subject-specific trained model can improve the accuracy of the image quality enhancement processing and the region detection processing.

［変形例３］ [Modification 3]
高画質化エンジンは、入力データである各種画像の種類毎に用意されてもよい。例えば、前眼画像用の高画質化モデルや、ＳＬＯ画像用の高画質化モデル、断層画像用の高画質化モデル、ＯＣＴＡ正面画像用の高画質化モデル等が用意されてよい。また、ＯＣＴＡ正面画像やＥｎ－Ｆａｃｅ画像については、画像を生成するための深度範囲毎に高画質化モデルが用意されてもよい。例えば、表層用の高画質化モデルや深層用の高画質化モデル等が用意されてよい。さらに、高画質化モデルは、撮影部位（例えば、黄斑部中心、視神経乳頭部中心）毎の画像について学習を行ったものでもよいし、撮影部位に関わらず学習を行ったものであってもよい。 An image quality enhancement engine may be prepared for each type of image used as input data. For example, an image quality enhancement model for anterior eye images, one for SLO images, one for tomographic images, one for OCTA front images, and so on may be prepared. For OCTA front images and En-Face images, an image quality enhancement model may also be prepared for each depth range used to generate the image, for example a model for the superficial layer and a model for the deep layer. Furthermore, the image quality enhancement model may be one trained on images of each imaging site (for example, centered on the macula or centered on the optic disc), or one trained regardless of the imaging site.

このとき、例えば、眼底ＯＣＴＡ正面画像を学習データとして学習して得た高画質化モデルを用いて、眼底ＯＣＴＡ正面画像を高画質化し、さらに、前眼ＯＣＴＡ正面画像を学習データとして学習して得た高画質化モデルを用いて、前眼ＯＣＴＡ正面画像を高画質化してもよい。また、高画質化モデルは、撮影部位を関わらず学習を行ったものであってもよい。ここで、例えば、眼底ＯＣＴＡ正面画像及び前眼ＯＣＴＡ正面画像は、撮影対象である血管の分布の様子が互いに比較的類似していることがある。このように、撮影対象の様子が互いに比較的類似しているような複数の種類の医用画像では、互いの特徴量が比較的類似していることがある。そこで、例えば、眼底ＯＣＴＡ正面画像を学習データとして学習して得た高画質化モデルを用いて、眼底ＯＣＴＡ正面画像を高画質化するだけでなく、前眼ＯＣＴＡ正面画像も高画質化可能に構成されてもよい。また、例えば、前眼ＯＣＴＡ正面画像を学習データとして学習して得た高画質化モデルを用いて、前眼ＯＣＴＡ正面画像を高画質化するだけでなく、眼底ＯＣＴＡ正面画像も高画質化可能に構成されてもよい。すなわち、眼底ＯＣＴＡ正面画像と前眼ＯＣＴＡ正面画像との少なくとも一つの種類の正面画像を学習データとして学習して得た高画質化モデルを用いて、眼底ＯＣＴＡ正面画像と前眼ＯＣＴＡ正面画像との少なくとも一つの種類の正面画像を高画質化可能に構成されてもよい。 At this time, for example, a fundus OCTA front image may be enhanced using an image quality enhancement model obtained by training on fundus OCTA front images as learning data, and an anterior eye OCTA front image may be enhanced using an image quality enhancement model obtained by training on anterior eye OCTA front images as learning data. The image quality enhancement model may also be one trained regardless of the imaging site. Here, for example, the fundus OCTA front image and the anterior eye OCTA front image may be relatively similar in how the blood vessels being imaged are distributed. In multiple types of medical images whose imaging targets look relatively similar in this way, the feature amounts may also be relatively similar. Therefore, for example, an image quality enhancement model obtained by training on fundus OCTA front images as learning data may be configured to enhance not only fundus OCTA front images but also anterior eye OCTA front images. Likewise, an image quality enhancement model obtained by training on anterior eye OCTA front images as learning data may be configured to enhance not only anterior eye OCTA front images but also fundus OCTA front images. That is, an image quality enhancement model obtained by training on at least one of the fundus OCTA front image and the anterior eye OCTA front image as learning data may be used to enhance the image quality of at least one of the fundus OCTA front image and the anterior eye OCTA front image.
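
As a non-limiting illustration, the choice of an image quality enhancement model by image type and depth range, including a model shared between fundus and anterior eye OCTA front images, could be sketched as follows. The type names, model names, and the `ALIASES` and `REGISTRY` tables are hypothetical and are used only for this sketch.

```python
from typing import Dict, Optional, Tuple

# Assumption: fundus and anterior eye OCTA front images are treated as one
# type because their vessel patterns are considered relatively similar.
ALIASES: Dict[str, str] = {
    "fundus_octa_front": "octa_front",
    "anterior_octa_front": "octa_front",
}

# Hypothetical registry keyed by (image type, depth range).
# A depth range of None means "any depth range".
REGISTRY: Dict[Tuple[str, Optional[str]], str] = {
    ("anterior_eye", None): "anterior_eye_model",
    ("slo", None): "slo_model",
    ("oct_tomogram", None): "tomogram_model",
    ("octa_front", "superficial"): "octa_superficial_model",
    ("octa_front", "deep"): "octa_deep_model",
    ("octa_front", None): "octa_common_model",
}


def select_model(image_type: str, depth_range: Optional[str] = None) -> str:
    """Return the name of the model registered for this image type."""
    base_type = ALIASES.get(image_type, image_type)
    # Prefer a depth-specific model, then a model for any depth range.
    for key in ((base_type, depth_range), (base_type, None)):
        if key in REGISTRY:
            return REGISTRY[key]
    raise KeyError(f"no image quality enhancement model for {image_type!r}")
```

Under these assumptions, `select_model("fundus_octa_front", "deep")` and `select_model("anterior_octa_front", "deep")` resolve to the same shared model.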

ここで、眼底撮影可能なＯＣＴ装置において、前眼も撮影可能である場合を考える。このとき、ＯＣＴＡのＥｎ－Ｆａｃｅ画像には、例えば、眼底撮影モードにおいては眼底ＯＣＴＡ正面画像が適用され、また、前眼部撮影モードにおいては前眼ＯＣＴＡ正面画像が適用されてもよい。このとき、高画質化ボタンが押下されると、例えば、眼底撮影モードにおいては、ＯＣＴＡのＥｎ－Ｆａｃｅ画像の表示領域において、低画質の眼底ＯＣＴＡ正面画像と高画質の眼底ＯＣＴＡ正面画像とのうち一方の表示が他方の表示に変更されるように構成されてもよい。また、高画質化ボタンが押下されると、例えば、前眼部撮影モードにおいては、ＯＣＴＡのＥｎ－Ｆａｃｅ画像の表示領域において、低画質の前眼ＯＣＴＡ正面画像と高画質の前眼ＯＣＴＡ正面画像とのうち一方の表示が他方の表示に変更されるように構成されてもよい。 Here, consider a case where an OCT apparatus capable of imaging the fundus can also image the anterior eye. In this case, as the OCTA En-Face image, for example, the fundus OCTA front image may be used in the fundus imaging mode and the anterior eye OCTA front image may be used in the anterior segment imaging mode. When the image quality enhancement button is pressed in the fundus imaging mode, for example, the display area of the OCTA En-Face image may be configured to switch from displaying one of the low-quality fundus OCTA front image and the high-quality fundus OCTA front image to displaying the other. Similarly, when the image quality enhancement button is pressed in the anterior segment imaging mode, for example, the display area of the OCTA En-Face image may be configured to switch from displaying one of the low-quality anterior eye OCTA front image and the high-quality anterior eye OCTA front image to displaying the other.

なお、眼底撮影可能なＯＣＴ装置において、前眼も撮影可能とする場合に、前眼アダプタが装着可能に構成されてもよい。また、前眼アダプタを用いずに、ＯＣＴ装置の光学系が被検眼の眼軸長程度の距離、移動可能に構成されてもよい。このとき、ＯＣＴ装置のフォーカス位置が前眼に結像する程度、正視側に大きく変更可能に構成されてもよい。 An OCT apparatus capable of photographing the fundus may be configured so that an anterior eye adapter can be attached when the anterior eye can also be photographed. Alternatively, the optical system of the OCT apparatus may be configured to be movable by a distance approximately equal to the axial length of the subject's eye without using the anterior eye adapter. At this time, the focus position of the OCT apparatus may be configured so as to be able to be greatly changed to the normal vision side to the extent that an image is formed on the anterior eye.

また、断層画像には、例えば、眼底撮影モードにおいては眼底ＯＣＴ断層画像が適用され、また、前眼部撮影モードにおいては前眼ＯＣＴ断層画像が適用されてもよい。また、上述した眼底ＯＣＴＡ正面画像及び前眼ＯＣＴＡ正面画像の高画質化処理は、例えば、眼底ＯＣＴ断層画像及び前眼ＯＣＴ断層画像の高画質化処理として適用することも可能である。このとき、高画質化ボタンが押下されると、例えば、眼底撮影モードにおいては、断層画像の表示領域において、低画質の眼底ＯＣＴ断層画像と高画質の眼底ＯＣＴ断層画像とのうち一方の表示が他方の表示に変更されるように構成されてもよい。また、高画質化ボタンが押下されると、例えば、前眼部撮影モードにおいては、断層画像の表示領域において、低画質の前眼ＯＣＴ断層画像と高画質の前眼ＯＣＴ断層画像とのうち一方の表示が他方の表示に変更されるように構成されてもよい。 For the tomographic image, for example, a fundus OCT tomographic image may be applied in the fundus imaging mode, and an anterior eye OCT tomographic image may be applied in the anterior segment imaging mode. Further, the image quality enhancement processing for the fundus OCTA front image and the anterior eye OCTA front image described above can also be applied as image quality enhancement processing for the fundus OCT tomographic image and the anterior eye OCT tomographic image, for example. At this time, when the image quality enhancement button is pressed, for example, in the fundus imaging mode, one of the low-quality fundus OCT tomographic image and the high-quality fundus OCT tomographic image is displayed in the tomographic image display area. It may be configured to change to the other display. Further, when the image quality enhancement button is pressed, for example, in the anterior segment imaging mode, one of the low image quality anterior eye OCT tomographic image and the high image quality anterior eye OCT tomographic image is displayed in the tomographic image display area. may be configured such that the display of one is changed to the display of the other.

また、断層画像には、例えば、眼底撮影モードにおいては眼底ＯＣＴＡ断層画像が適用され、また、前眼部撮影モードにおいては前眼ＯＣＴＡ断層画像が適用されてもよい。また、上述した眼底ＯＣＴＡ正面画像及び前眼ＯＣＴＡ正面画像の高画質化処理は、例えば、眼底ＯＣＴＡ断層画像及び前眼ＯＣＴＡ断層画像の高画質化処理として適用することも可能である。このとき、例えば、眼底撮影モードにおいては、断層画像の表示領域において、眼底ＯＣＴＡ断層画像における血管領域（例えば、閾値以上のモーションコントラストデータ）を示す情報が、対応する位置の眼底ＯＣＴ断層画像に重畳して表示されるように構成されてもよい。また、例えば、前眼部撮影モードにおいては、断層画像の表示領域において、前眼ＯＣＴＡ断層画像における血管領域を示す情報が、対応する位置の前眼ＯＣＴ断層画像に重畳して表示されてもよい。 As the tomographic image, for example, a fundus OCTA tomographic image may be used in the fundus imaging mode and an anterior eye OCTA tomographic image may be used in the anterior segment imaging mode. The image quality enhancement processing for the fundus OCTA front image and the anterior eye OCTA front image described above can also be applied, for example, as image quality enhancement processing for the fundus OCTA tomographic image and the anterior eye OCTA tomographic image. In this case, for example, in the fundus imaging mode, information indicating a blood vessel region in the fundus OCTA tomographic image (for example, motion contrast data equal to or greater than a threshold value) may be superimposed on the fundus OCT tomographic image at the corresponding position in the tomographic image display area. Similarly, in the anterior segment imaging mode, information indicating the blood vessel region in the anterior eye OCTA tomographic image may be superimposed on the anterior eye OCT tomographic image at the corresponding position in the tomographic image display area.

このように、例えば、複数の種類の医用画像の特徴量（撮影対象の様子）が互いに比較的類似していると考えられるような場合には、複数の種類の医用画像の少なくとも一つの種類の医用画像を学習データとして学習して得た高画質化モデルを用いて、複数の種類の医用画像の少なくとも一つの種類の医用画像を高画質化可能に構成されてもよい。これにより、例えば、共通の学習済モデル（共通の高画質化モデル）を用いて、複数の種類の医用画像の高画質化を実行可能に構成することができる。 In this way, for example, when the feature values (appearance of an object to be imaged) of a plurality of types of medical images are considered to be relatively similar to each other, at least one of the plurality of types of medical images A high image quality model obtained by learning medical images as learning data may be used to improve the image quality of at least one type of medical image among a plurality of types of medical images. As a result, for example, a common learned model (common image quality improvement model) can be used to enable the image quality improvement of multiple types of medical images.

なお、眼底撮影モードの表示画面と前眼部撮影モードの表示画面とは、同じ表示レイアウトであってもよいし、それぞれの撮影モードに対応する表示レイアウトであってもよい。眼底撮影モードと前眼部撮影モードとで、撮影条件や解析条件等の種々の条件が同じであってもよいし、異なっていてもよい。 The display screen in the fundus imaging mode and the display screen in the anterior segment imaging mode may have the same display layout, or may have display layouts corresponding to the respective imaging modes. Various conditions such as imaging conditions and analysis conditions may be the same or different between the fundus imaging mode and the anterior segment imaging mode.

ここで、高画質化処理の対象画像は、例えば、（複数の深度範囲に対応する）複数のＯＣＴＡ正面画像（ＯＣＴＡのＥｎ－Ｆａｃｅ画像、モーションコントラストのＥｎ－Ｆａｃｅ画像）であってもよい。また、高画質化処理の対象画像は、例えば、１つの深度範囲に対応する１つのＯＣＴＡ正面画像であってもよい。また、高画質化処理の対象画像は、ＯＣＴＡ正面画像の代わりに、例えば、輝度の正面画像（輝度のＥｎ－Ｆａｃｅ画像）、あるいはＢスキャン画像であるＯＣＴ断層画像やモーションコントラストデータの断層画像（ＯＣＴＡ断層画像）であってもよい。また、高画質化処理の対象画像は、ＯＣＴＡ正面画像だけでなく、例えば、輝度の正面画像及びＢスキャン画像であるＯＣＴ断層画像やモーションコントラストデータの断層画像（ＯＣＴＡ断層画像）等の種々の医用画像であってもよい。すなわち、高画質化処理の対象画像は、例えば、出力部１０３の表示画面上に表示されている種々の医用画像の少なくとも１つであればよい。このとき、例えば、画像の種類毎に画像の特徴量が異なる場合があるため、高画質化処理の対象画像の各種類に対応する高画質化用の学習済モデルが用いられてもよい。例えば、検者からの指示に応じて高画質化ボタンが押下されると、ＯＣＴＡ正面画像に対応する高画質化用の学習済モデルを用いてＯＣＴＡ正面画像を高画質化処理するだけでなく、ＯＣＴ断層画像に対応する高画質化用の学習済モデルを用いてＯＣＴ断層画像も高画質化処理するように構成されてもよい。また、例えば、検者からの指示に応じて高画質化ボタンが押下されると、ＯＣＴＡ正面画像に対応する高画質化用の学習済モデルを用いて生成された高画質なＯＣＴＡ正面画像の表示に変更されるだけでなく、ＯＣＴ断層画像に対応する高画質化用の学習済モデルを用いて生成された高画質なＯＣＴ断層画像の表示に変更されるように構成されてもよい。このとき、ＯＣＴ断層画像の位置を示すラインがＯＣＴＡ正面画像に重畳表示されるように構成されてもよい。また、上記ラインは、検者からの指示に応じてＯＣＴＡ正面画像上で移動可能に構成されてもよい。また、高画質化ボタンの表示がアクティブ状態である場合には、上記ラインが移動された後に、現在のラインの位置に対応するＯＣＴ断層画像を高画質化処理して得た高画質なＯＣＴ断層画像の表示に変更されるように構成されてもよい。また、高画質化処理の対象画像毎に高画質化ボタンが表示されることで、画像毎に独立して高画質化処理可能に構成されてもよい。 Here, the target image of the image quality enhancement process may be, for example, a plurality of OCTA front images (OCTA En-Face images, motion contrast En-Face images) (corresponding to a plurality of depth ranges). Also, the target image for image quality enhancement processing may be, for example, one OCTA front image corresponding to one depth range. In addition, instead of the OCTA front image, the target image of the image quality improvement process is, for example, a luminance front image (luminance En-Face image), an OCT tomographic image that is a B-scan image, or a tomographic image of motion contrast data ( OCTA tomographic image). 
The target image for image quality enhancement processing is not limited to OCTA front images and may be any of various medical images, such as a luminance front image, an OCT tomographic image that is a B-scan image, or a tomographic image of motion contrast data (OCTA tomographic image). That is, the target image may be, for example, at least one of the various medical images displayed on the display screen of the output unit 103. Because image feature amounts may differ between image types, a trained model for image quality enhancement corresponding to each type of target image may be used. For example, when the image quality enhancement button is pressed in response to an instruction from the examiner, not only may the OCTA front image be enhanced using the trained model corresponding to OCTA front images, but the OCT tomographic image may also be enhanced using the trained model corresponding to OCT tomographic images. Likewise, when the image quality enhancement button is pressed in response to an instruction from the examiner, the display may be changed not only to a high-quality OCTA front image generated using the trained model corresponding to OCTA front images but also to a high-quality OCT tomographic image generated using the trained model corresponding to OCT tomographic images. At this time, a line indicating the position of the OCT tomographic image may be superimposed on the OCTA front image. The line may be configured to be movable on the OCTA front image in response to an instruction from the examiner.
Further, when the image quality enhancement button is displayed in an active state, after the line is moved, the display may be changed to a high-quality OCT tomographic image obtained by enhancing the OCT tomographic image corresponding to the current line position. An image quality enhancement button may also be displayed for each target image so that image quality enhancement processing can be performed independently for each image.
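
As a non-limiting illustration, the behavior in which one press of the image quality enhancement button switches several displayed images to their enhanced versions, each using the model for its own image type, could be sketched as follows. The `Panel` class, the `on_enhance_button` function, and the representation of images as lists of numbers are assumptions made for this sketch.

```python
from typing import Callable, Dict, List, Optional

Image = List[float]
Model = Callable[[Image], Image]


class Panel:
    """Hypothetical display area holding one medical image."""

    def __init__(self, image_type: str, original: Image) -> None:
        self.image_type = image_type
        self.original = original
        self.enhanced: Optional[Image] = None
        self.show_enhanced = False

    def displayed(self) -> Image:
        # Show the enhanced image only when requested and available.
        if self.show_enhanced and self.enhanced is not None:
            return self.enhanced
        return self.original


def on_enhance_button(panels: List[Panel], models: Dict[str, Model],
                      active: bool) -> None:
    """Apply the button state to every panel that has a matching model."""
    for panel in panels:
        model = models.get(panel.image_type)
        if model is None:
            continue  # no model prepared for this image type
        if active and panel.enhanced is None:
            panel.enhanced = model(panel.original)  # enhance once, then reuse
        panel.show_enhanced = active
```

A per-image button, as also described above, would correspond to calling the same function with a list containing a single panel.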

また、ＯＣＴＡ断層画像における血管領域（例えば、閾値以上のモーションコントラストデータ）を示す情報が、対応する位置のＢスキャン画像であるＯＣＴ断層画像に重畳して表示されてもよい。このとき、例えば、ＯＣＴ断層画像が高画質化されると、対応する位置のＯＣＴＡ断層画像が高画質化されてもよい。そして、高画質化して得たＯＣＴＡ断層画像における血管領域を示す情報が、高画質化して得たＯＣＴ断層画像に重畳して表示されてもよい。なお、血管領域を示す情報は、色等の識別可能な情報であれば何でもよい。また、血管領域を示す情報の重畳表示と非表示とが検者からの指示に応じて変更可能に構成されてもよい。また、ＯＣＴ断層画像の位置を示すラインがＯＣＴＡ正面画像上で移動されると、ラインの位置に応じてＯＣＴ断層画像の表示が更新されてもよい。このとき、対応する位置のＯＣＴＡ断層画像も更新されるため、ＯＣＴＡ断層画像から得られる血管領域を示す情報の重畳表示が更新されてもよい。これにより、例えば、任意の位置において、血管領域と注目領域との位置関係を容易に確認しながら、血管領域の３次元の分布や状態を効果的に確認することができる。また、ＯＣＴＡ断層画像の高画質化は、高画質化用の学習済モデルを用いる代わりに、対応する位置で取得した複数のＯＣＴＡ断層画像の加算平均処理等による高画質化処理であってもよい。また、ＯＣＴ断層画像は、ＯＣＴボリュームデータにおける任意の位置の断面として再構成された疑似ＯＣＴ断層画像であってもよい。また、ＯＣＴＡ断層画像は、ＯＣＴＡボリュームデータにおける任意の位置の断面として再構成された疑似ＯＣＴＡ断層画像であってもよい。なお、任意の位置は、少なくとも１つの任意の位置であればよく、また、検者からの指示に応じて変更可能に構成されてもよい。このとき、複数の位置に対応する複数の疑似断層画像が再構成されるように構成されてもよい。 Information indicating a blood vessel region (for example, motion contrast data equal to or greater than a threshold) in an OCTA tomographic image may be superimposed and displayed on an OCT tomographic image, which is a B-scan image of the corresponding position. At this time, for example, when the quality of the OCT tomographic image is improved, the quality of the OCTA tomographic image at the corresponding position may be improved. Information indicating the blood vessel region in the OCTA tomographic image obtained with high image quality may be superimposed and displayed on the OCT tomographic image obtained with high image quality. The information indicating the blood vessel region may be any information as long as it is identifiable information such as color. In addition, superimposed display and non-display of information indicating a blood vessel region may be configured to be changeable according to an instruction from the examiner. Further, when the line indicating the position of the OCT tomographic image is moved on the OCTA front image, the display of the OCT tomographic image may be updated according to the position of the line. 
At this time, since the OCTA tomographic image at the corresponding position is also updated, the superimposed display of the information indicating the blood vessel region obtained from the OCTA tomographic image may be updated as well. This makes it possible, for example, to effectively check the three-dimensional distribution and state of the blood vessel region while easily checking the positional relationship between the blood vessel region and the region of interest at an arbitrary position. Instead of using a trained model for image quality enhancement, the image quality of the OCTA tomographic image may be enhanced by processing such as averaging a plurality of OCTA tomographic images acquired at the corresponding position. The OCT tomographic image may be a pseudo OCT tomographic image reconstructed as a cross section at an arbitrary position in OCT volume data. Likewise, the OCTA tomographic image may be a pseudo OCTA tomographic image reconstructed as a cross section at an arbitrary position in OCTA volume data. The arbitrary position need only be at least one arbitrary position and may be configured to be changeable in response to an instruction from the examiner. In this case, a plurality of pseudo tomographic images corresponding to a plurality of positions may be reconstructed.
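
As a non-limiting illustration, the superimposition of blood vessel information (motion contrast values equal to or greater than a threshold) on an OCT tomographic image, and the averaging of several OCTA tomographic images acquired at the same position, could be sketched as follows. The function names, the use of nested lists as images, and the choice of red as the identifying color are assumptions made for this sketch.

```python
from typing import List, Sequence, Tuple

Gray = List[List[int]]        # OCT tomographic image, gray levels 0-255
Contrast = List[List[float]]  # OCTA tomographic image, motion contrast values
RGB = Tuple[int, int, int]


def overlay_vessels(oct_img: Gray, octa_img: Contrast, threshold: float,
                    color: RGB = (255, 0, 0)) -> List[List[RGB]]:
    """Paint pixels whose motion contrast is >= threshold in `color`."""
    if len(oct_img) != len(octa_img) or any(
            len(a) != len(b) for a, b in zip(oct_img, octa_img)):
        raise ValueError("OCT and OCTA tomographic images must have the same shape")
    return [
        [color if m >= threshold else (g, g, g) for g, m in zip(oct_row, octa_row)]
        for oct_row, octa_row in zip(oct_img, octa_img)
    ]


def average_tomograms(tomograms: Sequence[Contrast]) -> Contrast:
    """Pixel-wise mean of tomographic images acquired at the same position."""
    n = len(tomograms)
    return [[sum(px) / n for px in zip(*rows)] for rows in zip(*tomograms)]
```

When the line on the front image is moved, the same two functions would simply be called again on the tomographic images at the new position.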

なお、表示される断層画像（例えば、ＯＣＴ断層画像あるいはＯＣＴＡ断層画像）は、１つだけ表示されてもよいし、複数表示されてもよい。複数の断層画像が表示される場合には、それぞれ異なる副走査方向の位置で取得された断層画像が表示されてもよいし、例えばクロススキャン等により得られた複数の断層画像を高画質化して表示する場合には、異なる走査方向の画像がそれぞれ表示されてもよい。また、例えばラジアルスキャン等により得られた複数の断層画像を高画質化して表示する場合には、一部選択された複数の断層画像（例えば基準ラインに対して互いに対称な位置の２つの断層画像）がそれぞれ表示されてもよい。さらに、経過観察用の表示画面（フォローアップ用の表示画面）に複数の断層画像を表示し、上述の方法と同様の手法により高画質化の指示や解析結果（例えば、特定の層の厚み等）の表示が行われてもよい。このとき、表示される複数の断層画像は、被検眼の所定部位の異なる日時に得た複数の断層画像であってもよいし、同一検査日の異なる時間に得た複数の断層画像であってもよい。また、上述の方法と同様の手法によりデータベースに保存されている情報に基づいて断層画像に高画質化処理を実行してもよい。 Only one tomographic image (for example, an OCT tomographic image or an OCTA tomographic image) may be displayed, or a plurality may be displayed. When a plurality of tomographic images are displayed, tomographic images acquired at different positions in the sub-scanning direction may be displayed; when a plurality of tomographic images obtained by, for example, a cross scan are enhanced and displayed, the images in the different scanning directions may each be displayed. When a plurality of tomographic images obtained by, for example, a radial scan are enhanced and displayed, a selected subset of the tomographic images (for example, two tomographic images at positions symmetrical to each other with respect to a reference line) may each be displayed. Furthermore, a plurality of tomographic images may be displayed on a display screen for follow-up observation, and instructions for image quality enhancement and analysis results (for example, the thickness of a specific layer) may be displayed by a technique similar to the method described above. In this case, the displayed tomographic images may be a plurality of tomographic images of a predetermined part of the subject's eye obtained on different dates and times, or a plurality of tomographic images obtained at different times on the same examination date. Image quality enhancement processing may also be performed on the tomographic images based on information stored in a database, by a technique similar to the method described above.

同様に、ＳＬＯ画像を高画質化して表示する場合には、例えば、同一の表示画面に表示されるＳＬＯ画像を高画質化して表示してもよい。さらに、輝度の正面画像を高画質化して表示する場合には、例えば、同一の表示画面に表示される輝度の正面画像を高画質化して表示してよい。さらに、経過観察用の表示画面に複数のＳＬＯ画像や輝度の正面画像を表示し、上述の方法と同様の手法により高画質化の指示や解析結果（例えば、特定の層の厚み等）の表示が行われてもよい。また、上述の方法と同様の手法によりデータベースに保存されている情報に基づいてＳＬＯ画像や輝度の正面画像に高画質化処理を実行してもよい。なお、断層画像、ＳＬＯ画像、及び輝度の正面画像の表示は例示であり、これらの画像は所望の構成に応じて任意の態様で表示されてよい。また、ＯＣＴＡ正面画像、断層画像、ＳＬＯ画像、及び輝度の正面画像の少なくとも２つ以上が、一度の指示で高画質化され表示されてもよい。 Similarly, when the SLO image is displayed with high image quality, for example, the SLO image displayed on the same display screen may be displayed with high image quality. Furthermore, in the case of displaying the front image with luminance in a higher quality, for example, the front image with luminance displayed on the same display screen may be displayed with a higher image quality. Furthermore, a plurality of SLO images and a front image of brightness are displayed on the display screen for follow-up observation, and instructions for improving image quality and analysis results (e.g., thickness of a specific layer, etc.) are displayed by the same method as the above method. may be performed. Further, image quality enhancement processing may be performed on the SLO image or the luminance front image based on the information stored in the database by a technique similar to the above-described method. It should be noted that the display of the tomographic image, the SLO image, and the luminance front image is an example, and these images may be displayed in any manner according to the desired configuration. Also, at least two or more of the OCTA front image, the tomographic image, the SLO image, and the luminance front image may be displayed with high image quality by a single instruction.

このような構成により、高画質化処理して得た高画質画像を表示制御部１０１－５が出力部１０３に表示させることができる。なお、高画質画像の表示、解析結果の表示、表示される正面画像の深度範囲等に関する複数の条件のうち少なくとも１つの条件が選択されている場合には、表示画面が遷移されても、選択された条件が維持されるように構成されてもよい。なお、各種高画質画像や上記ライン、血管領域を示す情報等の表示の制御は、表示制御部１０１－５によって行われてよい。 With such a configuration, the display control unit 101-5 can cause the output unit 103 to display a high-quality image obtained by the high-quality image processing. Note that if at least one of a plurality of conditions related to the display of high-quality images, the display of analysis results, the depth range of the front image to be displayed, etc. is selected, even if the display screen transitions, the selection It may be configured such that the specified conditions are maintained. The display control unit 101-5 may control the display of various high-quality images, information indicating the lines and blood vessel regions, and the like.

また、高画質化モデルは、表示制御部１０１－５によって出力部１０３に表示されるプレビュー画面において、ライブ動画像のすくなくとも１つのフレーム毎に用いられてもよい。このとき、プレビュー画面において、異なる部位や異なる種類の複数のライブ動画像が表示されている場合には、各ライブ動画像に対応する学習済モデルが用いられるように構成されてもよい。例えば、アライメント処理に用いる前眼画像について、前眼画像用の高画質化モデルを用いて高画質化された画像を用いてもよい。同様に各種画像における所定領域の検出処理について用いられる各種画像について、それぞれの画像用の高画質化モデルを用いて高画質化された画像を用いてもよい。 Also, the high image quality model may be used for at least one frame of the live moving image on the preview screen displayed on the output unit 103 by the display control unit 101-5. At this time, when a plurality of live moving images of different parts or different types are displayed on the preview screen, the learned model corresponding to each live moving image may be used. For example, an anterior eye image used for alignment processing may be an image whose image quality has been enhanced using an image quality enhancement model for an anterior eye image. Similarly, for various images used in the process of detecting a predetermined region in various images, images whose image quality has been enhanced using the image quality enhancement model for each image may be used.

このとき、例えば、検者からの指示に応じて高画質化ボタンが押下された場合には、異なる種類の複数のライブ動画像（例えば、前眼画像、ＳＬＯ画像、断層画像）の表示を（同時に）、それぞれ高画質化処理されることにより得た高画質動画像の表示に変更されるように構成されてもよい。このとき、高画質動画像の表示は、各フレームを高画質化処理して得た高画質画像の連続表示であってもよい。また、例えば、画像の種類毎に画像の特徴量が異なる場合があるため、高画質化処理の対象画像の各種類に対応する高画質化用の学習済モデルが用いられてもよい。例えば、検者からの指示に応じて高画質化ボタンが押下されると、前眼画像に対応する高画質化モデルを用いて前眼画像を高画質化処理するだけでなく、ＳＬＯ画像に対応する高画質化モデルを用いてＳＬＯ画像も高画質化処理するように構成されてもよい。また、例えば、検者からの指示に応じて高画質化ボタンが押下されると、前眼画像に対応する高画質化モデルを用いて生成された高画質な前眼画像の表示に変更されるだけでなく、ＳＬＯ画像に対応する高画質化モデルを用いて生成された高画質なＳＬＯ画像の表示に変更されるように構成されてもよい。また、例えば、検者からの指示に応じて高画質化ボタンが押下されると、ＳＬＯ画像に対応する高画質化モデルを用いてＳＬＯ画像を高画質化処理するだけでなく、断層画像に対応する高画質化モデルを用いて断層画像も高画質化処理するように構成されてもよい。また、例えば、検者からの指示に応じて高画質化ボタンが押下されると、ＳＬＯ画像に対応する高画質化モデルを用いて生成された高画質なＳＬＯ画像の表示に変更されるだけでなく、断層画像に対応する高画質化モデルを用いて生成された高画質な断層画像の表示に変更されるように構成されてもよい。このとき、断層画像の位置を示すラインがＳＬＯ画像に重畳表示されるように構成されてもよい。また、上記ラインは、検者からの指示に応じてＳＬＯ画像上で移動可能に構成されてもよい。また、高画質化ボタンの表示がアクティブ状態である場合には、上記ラインが移動された後に、現在のラインの位置に対応する断層画像を高画質化処理して得た高画質な断層画像の表示に変更されるように構成されてもよい。また、高画質化処理の対象画像毎に高画質化ボタンが表示されることで、画像毎に独立して高画質化処理可能に構成されてもよい。 At this time, for example, when the image quality enhancement button is pressed in response to an instruction from the examiner, a plurality of different types of live moving images (for example, anterior eye images, SLO images, tomographic images) are displayed ( at the same time), the display may be changed to display a high-quality moving image obtained by the high-quality image processing. At this time, the display of the high-quality moving image may be continuous display of high-quality images obtained by subjecting each frame to high-quality image processing. Further, for example, since the feature amount of an image may differ for each type of image, a trained model for image quality enhancement corresponding to each type of target image for image quality enhancement processing may be used. 
For example, when the image quality enhancement button is pressed in response to an instruction from the examiner, not only may the anterior eye image be enhanced using the image quality enhancement model corresponding to anterior eye images, but the SLO image may also be enhanced using the image quality enhancement model corresponding to SLO images. Likewise, when the button is pressed, the display may be changed not only to a high-quality anterior eye image generated using the model corresponding to anterior eye images but also to a high-quality SLO image generated using the model corresponding to SLO images. Similarly, when the button is pressed, not only may the SLO image be enhanced using the image quality enhancement model corresponding to SLO images, but the tomographic image may also be enhanced using the image quality enhancement model corresponding to tomographic images. Likewise, when the button is pressed, the display may be changed not only to a high-quality SLO image generated using the model corresponding to SLO images but also to a high-quality tomographic image generated using the model corresponding to tomographic images. At this time, a line indicating the position of the tomographic image may be superimposed on the SLO image. The line may be configured to be movable on the SLO image in response to an instruction from the examiner.
Further, when the image quality enhancement button is displayed in an active state, after the line is moved, the display may be changed to a high-quality tomographic image obtained by enhancing the tomographic image corresponding to the current line position. An image quality enhancement button may also be displayed for each target image so that image quality enhancement processing can be performed independently for each image.

これにより、例えば、ライブ動画像であっても、処理時間を短縮することができるため、検者は撮影開始前に精度の高い情報を得ることができる。このため、例えば、プレビュー画面を確認しながら操作者がアライメント位置を修正する場合に、再撮影の失敗等を低減することができるため、診断の精度や効率を向上させることができる。また、画像処理装置１０１は、撮影開始に関する指示に応じて、撮影の途中あるいは撮影の最後に、セグメンテーション処理等により得たアーチファクト領域等の部分領域が再度撮影（リスキャン）されるように、上述した走査手段を駆動制御してもよい。なお、被検眼の動き等の状態によっては、１回のリスキャンでは上手く撮影できない場合があるため、所定の回数のリスキャンが繰り返されるように駆動制御されてもよい。このとき、所定の回数のリスキャンの途中でも、操作者からの指示に応じて（例えば、撮影キャンセルボタンの押下後に）リスキャンが終了されるように構成されてもよい。このとき、操作者からの指示に応じてリスキャンが終了されるまでの撮影データが保存されるように構成されてもよい。なお、例えば、撮影キャンセルボタンの押下後に確認ダイアログが表示され、撮影データの保存か、撮影データの破棄かを、操作者からの指示に応じて選択可能に構成されてもよい。また、例えば、撮影キャンセルボタンの押下後には、（現在のリスキャンは完了するまで実行されるが）次のリスキャンは実行されずに、確認ダイアログにおける操作者からの指示（入力）があるまで待機するように構成されてもよい。また、例えば、注目部位に関する物体検出結果の確からしさを示す情報（例えば、割合を示す数値）が閾値を超えた場合には、各調整や撮影開始等を自動的に行うように構成されてもよい。また、例えば、注目部位に関する物体検出結果の確からしさを示す情報（例えば、割合を示す数値）が閾値を超えた場合には、各調整や撮影開始等を検者からの指示に応じて実行可能な状態に変更（実行禁止状態を解除）するように構成されてもよい。 As a result, the processing time can be shortened even for a live moving image, for example, so that the examiner can obtain highly accurate information before the start of imaging. For this reason, for example, when the operator corrects the alignment position while checking the preview screen, it is possible to reduce the failure of re-imaging and the like, so that the accuracy and efficiency of diagnosis can be improved. In addition, the image processing apparatus 101 performs the above-described rescanning operation so that a partial area such as an artifact area obtained by segmentation processing or the like is re-captured (rescanned) during or at the end of imaging in response to an instruction regarding the start of imaging. The scanning means may be driven and controlled. Depending on the state of the subject's eye, such as the movement of the subject's eye, it may not be possible to capture an image successfully with one rescan. 
For this reason, drive control may be performed so that the rescan is repeated a predetermined number of times. In this case, the rescanning may be configured to end in response to an instruction from the operator (for example, after the imaging cancel button is pressed) even partway through the predetermined number of rescans. The imaging data acquired up to the point at which the rescanning is ended in response to the operator's instruction may be saved. For example, a confirmation dialog may be displayed after the imaging cancel button is pressed so that the operator can choose whether to save or discard the imaging data. Also, for example, after the imaging cancel button is pressed, the next rescan may not be executed (although the current rescan runs to completion), and the apparatus may wait for an instruction (input) from the operator in the confirmation dialog. Further, for example, when information indicating the certainty of an object detection result for a region of interest (for example, a numerical value indicating a ratio) exceeds a threshold value, the adjustments, the start of imaging, and the like may be performed automatically. Alternatively, when that information exceeds the threshold value, the apparatus may be changed to a state in which the adjustments, the start of imaging, and the like can be executed in response to an instruction from the examiner (that is, the execution-prohibited state may be released).
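
As a non-limiting illustration, the repetition of rescans up to a predetermined number, with cancellation taking effect only between rescans and a save-or-discard choice afterwards, could be sketched as follows. The function `run_rescans` and its callback parameters are hypothetical stand-ins for the scanning drive control, the imaging cancel button, and the confirmation dialog.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def run_rescans(scan_once: Callable[[], T],
                cancel_requested: Callable[[], bool],
                confirm_save: Callable[[], bool],
                max_rescans: int) -> List[T]:
    """Repeat rescans up to `max_rescans` times and return the data to keep.

    Cancellation is checked only before starting a rescan, so a rescan that
    has already started always runs to completion.
    """
    collected: List[T] = []
    cancelled = False
    for _ in range(max_rescans):
        if cancel_requested():
            cancelled = True
            break  # the next rescan is not executed
        collected.append(scan_once())
    # After a cancel, wait for the operator's choice: save or discard.
    if cancelled and not confirm_save():
        return []
    return collected
```

In this sketch, `confirm_save` blocks until the operator answers, which corresponds to waiting for input in the confirmation dialog.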

ここで、オートアライメント中では、被検眼２００の網膜等の撮影対象がまだ上手く撮像できていない可能性がある。このため、学習済モデルに入力される医用画像と学習データとして用いられた医用画像との違いが大きいために、精度良く高画質画像が得られない可能性がある。そこで、断層画像（Ｂスキャン画像）の画質評価等の評価値が閾値を超えたら、高画質動画像の表示（高画質フレームの連続表示）を自動的に開始するように構成してもよい。また、断層画像の画質評価等の評価値が閾値を超えたら、高画質化ボタンを検者が指定可能な状態（アクティブ状態）に変更するように構成されてもよい。なお、高画質化ボタンは、高画質化処理の実行を指定するためのボタンである。もちろん、高画質化ボタンは、高画質画像の表示を指示するためのボタンであってもよい。 Here, during the auto-alignment, there is a possibility that the object to be imaged, such as the retina of the subject's eye 200, has not yet been successfully imaged. For this reason, there is a possibility that a high-quality image cannot be obtained with high accuracy due to the large difference between the medical image input to the trained model and the medical image used as learning data. Therefore, when an evaluation value such as image quality evaluation of a tomographic image (B-scan image) exceeds a threshold value, display of high-quality moving images (continuous display of high-quality frames) may be automatically started. Further, when the evaluation value of the image quality evaluation of the tomographic image exceeds the threshold value, the image quality improvement button may be changed to a state in which the examiner can designate (active state). The image quality enhancement button is a button for specifying execution of image quality enhancement processing. Of course, the high image quality button may be a button for instructing display of a high quality image.
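
As a non-limiting illustration, the gating of the high-quality display by an image quality evaluation value could be sketched as follows. The function name, the numeric score, and the `auto_start` flag are assumptions made for this sketch; how the evaluation value is computed is not specified here.

```python
from typing import Tuple


def update_enhancement_state(quality_score: float, threshold: float,
                             auto_start: bool = False) -> Tuple[bool, bool]:
    """Return (button_enabled, show_enhanced) for the current B-scan.

    The button becomes selectable only when the evaluation value exceeds the
    threshold; with auto_start, the high-quality display also begins then.
    """
    passed = quality_score > threshold
    return passed, passed and auto_start
```

With these assumptions, a frame acquired during auto-alignment whose score is at or below the threshold leaves both the button and the high-quality display disabled.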

また、スキャンパターン等が異なる撮影モード毎に異なる高画質化モデルを用意して、選択された撮影モードに対応する高画質化用の学習済モデルが選択されるように構成されてもよい。また、異なる撮影モードで得た様々な医用画像を含む学習データを学習して得た１つの高画質化モデルが用いられてもよい。 Alternatively, different image quality enhancement models may be prepared for different image capturing modes with different scan patterns, etc., and a trained model for image quality enhancement corresponding to the selected image capturing mode may be selected. Also, one image quality enhancement model obtained by learning learning data including various medical images obtained in different imaging modes may be used.

ここで、眼科装置、例えばＯＣＴ装置では、撮影モード毎に測定に用いる光束のスキャンパターンや撮影部位が異なる。そのため、断層画像を入力データとする学習済モデルに関しては、撮影モード毎に学習済モデルを用意し、操作者の指示に応じて選択された撮影モードに対応する学習済モデルが選択されるように構成してもよい。この場合、撮影モードとしては、例えば、網膜撮影モード、前眼部撮影モード、硝子体撮影モード、黄斑部撮影モード、及び視神経乳頭部撮影モード、ＯＣＴＡ撮影モード等が含まれてよい。また、スキャンパターンとしては、３Ｄスキャン、ラジアルスキャン、クロススキャン、サークルスキャン、ラスタスキャン、及びリサージュスキャン（リサージュ曲線に沿った走査）等が含まれてよい。なお、ＯＣＴＡ撮影モードでは、被検眼の同一領域（同一位置）において測定光が複数回走査されるように、撮影制御部１０１－２が上述した走査部を制御する。ＯＣＴＡ撮影モードでも、スキャンパターンとして、例えばラスタスキャンや、ラジアルスキャン、クロススキャン、サークルスキャン、リサージュスキャン等を設定することができる。また、断層画像を入力データとする学習済モデルに関しては、異なる方向の断面に応じた断層画像を学習データに用いて学習を行うことができる。例えば、ｘｚ方向の断面の断層画像やｙｚ方向の断面の断層画像等を学習データに用いて学習を行ってよい。 Here, in an ophthalmologic apparatus, for example, an OCT apparatus, the scan pattern of the light flux used for measurement and the imaging region differ for each imaging mode. For this reason, with respect to the trained model that uses tomographic images as input data, a trained model is prepared for each imaging mode, and the trained model corresponding to the selected imaging mode is selected according to the operator's instruction. may be configured. In this case, the imaging modes may include, for example, a retinal imaging mode, an anterior segment imaging mode, a vitreous imaging mode, a macular imaging mode, an optic disc imaging mode, an OCTA imaging mode, and the like. Also, the scan pattern may include 3D scan, radial scan, cross scan, circle scan, raster scan, Lissajous scan (scan along a Lissajous curve), and the like. In the OCTA imaging mode, the imaging control unit 101-2 controls the above-described scanning unit so that the same region (same position) of the subject's eye is scanned multiple times with the measurement light. Even in the OCTA imaging mode, for example, raster scan, radial scan, cross scan, circle scan, Lissajous scan, etc. can be set as scan patterns. 
Also, with respect to a trained model that uses tomographic images as input data, it is possible to perform learning using tomographic images corresponding to cross sections in different directions as learning data. For example, learning may be performed using a tomographic image of a cross section in the xz direction, a tomographic image of a cross section in the yz direction, or the like as learning data.
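Selecting a trained model according to the imaging mode can be sketched as a simple lookup. The mode names, the `make_selector` helper, and the use of plain callables as stand-ins for trained models are all assumptions made for illustration. The fallback stands in for a single model trained on images from several imaging modes.

```python
from typing import Callable, Dict, List

Image = List[List[float]]
Model = Callable[[Image], Image]


def make_selector(models: Dict[str, Model], fallback: Model) -> Callable[[str], Model]:
    """Return a function that maps an imaging mode name to a model (hypothetical helper)."""
    def select(mode: str) -> Model:
        # Use the mode-specific model when one is registered, otherwise the shared model.
        return models.get(mode, fallback)
    return select
```

A caller would register one entry per imaging mode (for example a macular mode and an optic disc mode) and call `select(mode)` once the operator has chosen the mode.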

なお、高画質化モデルによる高画質化処理の実行（又は高画質化処理して得た高画質画像の表示）の要否の判断は、表示画面に設けられる高画質化ボタンについて、操作者の指示に応じて行われてもよいし、予め記憶部１０１－３に記憶されている設定に応じて行われてもよい。なお、学習済モデル（高画質化モデル）を用いた高画質化処理である旨を高画質化ボタンのアクティブ状態等で表示してもよいし、その旨をメッセージとして表示画面に表示させてもよい。また、高画質化処理の実行は、眼科装置の前回の起動時における実行状態を維持してもよいし、被検者毎に前回の検査時の実行状態を維持してもよい。 Note that whether to execute the image quality enhancement processing by the image quality enhancement model (or to display the high-quality image obtained by that processing) may be determined in accordance with an instruction from the operator via the image quality enhancement button provided on the display screen, or in accordance with settings stored in advance in the storage unit 101-3. The fact that the image quality enhancement processing uses a trained model (image quality enhancement model) may be indicated by, for example, the active state of the image quality enhancement button, or may be displayed as a message on the display screen. Further, the execution of the image quality enhancement processing may keep the execution state from the previous startup of the ophthalmologic apparatus, or may keep, for each subject, the execution state from the previous examination.

また、高画質化モデル等の種々の学習済モデルを適用可能な動画像は、ライブ動画像に限らず、例えば、記憶部１０１－３に記憶（保存）された動画像であってもよい。このとき、例えば、記憶部１０１－３に記憶（保存）された眼底の断層動画像の少なくとも１つのフレーム毎に位置合わせして得た動画像が表示画面に表示されてもよい。例えば、硝子体を好適に観察したい場合には、まず、フレーム上に硝子体ができるだけ存在する等の条件を基準とする基準フレームを選択してもよい。このとき、各フレームは、ＸＺ方向の断層画像（Ｂスキャン画像）である。そして、選択された基準フレームに対して他のフレームがＸＺ方向に位置合わせされた動画像が表示画面に表示されてもよい。このとき、例えば、動画像の少なくとも１つのフレーム毎に高画質化エンジンにより順次生成された高画質画像（高画質フレーム）を連続表示させるように構成されてもよい。 Further, moving images to which various learned models such as high-quality models can be applied are not limited to live moving images, and may be, for example, moving images stored (saved) in storage unit 101-3. At this time, for example, a moving image obtained by aligning at least one frame of the tomographic moving images of the fundus stored (saved) in the storage unit 101-3 may be displayed on the display screen. For example, when the vitreous body is desired to be properly observed, first, a reference frame may be selected based on conditions such as the presence of as much vitreous body as possible on the frame. At this time, each frame is a tomographic image (B-scan image) in the XZ direction. Then, a moving image in which another frame is aligned in the XZ direction with respect to the selected reference frame may be displayed on the display screen. At this time, for example, high-quality images (high-quality frames) sequentially generated by the high-quality image engine for each at least one frame of the moving image may be continuously displayed.

なお、上述したフレーム間の位置合わせの手法としては、Ｘ方向の位置合わせの手法とＺ方向（深度方向）の位置合わせの手法とは、同じ手法が適用されても良いし、全て異なる手法が適用されてもよい。また、同一方向の位置合わせは、異なる手法で複数回行われてもよく、例えば、粗い位置合わせを行った後に、精密な位置合わせが行われてもよい。また、位置合わせの手法としては、例えば、断層画像（Ｂスキャン画像）をセグメンテーション処理して得た網膜層境界を用いた（Ｚ方向の粗い）位置合わせ、断層画像を分割して得た複数の領域と基準画像との相関情報（類似度）を用いた（Ｘ方向やＺ方向の精密な）位置合わせ、断層画像（Ｂスキャン画像）毎に生成した１次元投影像を用いた（Ｘ方向の）位置合わせ、２次元正面画像を用いた（Ｘ方向の）位置合わせ等がある。また、ピクセル単位で粗く位置合わせが行われてから、サブピクセル単位で精密な位置合わせが行われるように構成されてもよい。 As for the inter-frame alignment described above, the same method may be applied to the alignment in the X direction and the alignment in the Z direction (depth direction), or entirely different methods may be applied. Alignment in the same direction may also be performed multiple times by different methods. For example, fine alignment may be performed after coarse alignment. Alignment methods include, for example, (coarse, Z-direction) alignment using retinal layer boundaries obtained by segmentation processing of a tomographic image (B-scan image), (fine, X-direction or Z-direction) alignment using correlation information (similarity) between a reference image and a plurality of regions obtained by dividing the tomographic image, (X-direction) alignment using a one-dimensional projection image generated for each tomographic image (B-scan image), and (X-direction) alignment using a two-dimensional front image. The configuration may also be such that coarse alignment is performed in units of pixels and fine alignment is then performed in units of sub-pixels.
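The coarse pixel-level alignment followed by sub-pixel refinement can be sketched on one-dimensional projection profiles. This is an illustrative sketch under stated assumptions, not the method of the disclosure: it uses an unnormalized correlation score and a three-point parabolic fit around the correlation peak, and the function names are introduced here for illustration.

```python
from typing import Sequence


def correlation_at(ref: Sequence[float], mov: Sequence[float], shift: int) -> float:
    """Sum of ref[i] * mov[i + shift] over the samples where both profiles overlap."""
    total = 0.0
    for i in range(len(ref)):
        j = i + shift
        if 0 <= j < len(mov):
            total += ref[i] * mov[j]
    return total


def estimate_shift(ref: Sequence[float], mov: Sequence[float], max_shift: int) -> float:
    """Return s such that mov[i + s] best matches ref[i], refined to sub-pixel precision."""
    scores = [correlation_at(ref, mov, s) for s in range(-max_shift, max_shift + 1)]
    # Coarse step: integer shift with the highest correlation score.
    k = max(range(len(scores)), key=scores.__getitem__)
    coarse = k - max_shift
    # Fine step: vertex of the parabola through the peak and its two neighbors.
    if 0 < k < len(scores) - 1:
        left, mid, right = scores[k - 1], scores[k], scores[k + 1]
        denom = left - 2.0 * mid + right
        if denom != 0:
            return coarse + 0.5 * (left - right) / denom
    return float(coarse)
```

For example, a profile and a copy of it displaced by two samples give an estimated shift of 2.0. In practice a normalized similarity measure would likely be preferred, since the unnormalized score favors shifts with larger overlap.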

また、高画質化モデルは、検者からの指示に応じて設定（変更）された割合の値を学習データとする追加学習により更新されてもよい。例えば、入力画像が比較的暗いときに、高画質画像に対する入力画像の割合を検者が高く設定する傾向にあれば、学習済モデルはそのような傾向となるように追加学習することになる。これにより、例えば、検者の好みに合った合成の割合を得ることができる学習済モデルとしてカスタマイズすることができる。このとき、設定（変更）された割合の値を追加学習の学習データとして用いるか否かを、検者からの指示に応じて決定するためのボタンが表示画面に表示されていてもよい。また、学習済モデルを用いて決定された割合をデフォルトの値とし、その後、検者からの指示に応じて割合の値をデフォルトの値から変更可能となるように構成されてもよい。また、高画質化モデルは、高画質化モデルを用いて生成された少なくとも１つの高画質画像を含む学習データを追加学習して得た学習済モデルであってもよい。このとき、高画質画像を追加学習用の学習データとして用いるか否かを、検者からの指示により選択可能に構成されてもよい。 Further, the high image quality model may be updated by additional learning using the value of the ratio set (changed) according to the instruction from the examiner as the learning data. For example, when the input image is relatively dark, if the examiner tends to set a high ratio of the input image to the high-quality image, the learned model undergoes additional learning so as to achieve such a tendency. As a result, for example, it can be customized as a trained model that can obtain a combination ratio that suits the examiner's taste. At this time, a button may be displayed on the display screen for determining whether or not to use the set (changed) ratio value as learning data for additional learning in accordance with an instruction from the examiner. Alternatively, the ratio determined using the trained model may be set as the default value, and thereafter the ratio value may be changed from the default value in accordance with an instruction from the examiner. Also, the high image quality model may be a trained model obtained by additionally learning learning data including at least one high quality image generated using the high image quality model. At this time, whether or not to use the high-quality image as learning data for additional learning may be selectable by an instruction from the examiner.
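The ratio between the input image and the high-quality image can be illustrated with a per-pixel linear blend. The linear form and the names `blend` and `input_ratio` are assumptions made here for illustration. The disclosure only states that a combination ratio is set and may be learned.

```python
from typing import List

Image = List[List[float]]


def blend(input_img: Image, enhanced_img: Image, input_ratio: float) -> Image:
    """Combine two images of equal size; input_ratio is the weight of the input image."""
    if not 0.0 <= input_ratio <= 1.0:
        raise ValueError("input_ratio must lie in [0, 1]")
    return [
        [input_ratio * a + (1.0 - input_ratio) * b for a, b in zip(row_in, row_hq)]
        for row_in, row_hq in zip(input_img, enhanced_img)
    ]
```

A value chosen by the examiner for `input_ratio`, together with a feature of the input image such as its brightness, is the kind of pair that could serve as learning data for the additional learning described above.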

［変形例４］ [Modification 4]
画像特徴取得部１０１－４４、抽出部１０１－４６２、及び領域検出部３０２は、画像セグメンテーション用の学習済モデルを用いてラベル画像を生成し、画像セグメンテーション処理を行ってもよい。ここでラベル画像とは、当該断層画像について画素毎に領域のラベルが付されたラベル画像をいう。具体的には、取得された画像に描出されている領域群のうち、任意の領域を特定可能な画素値（以下、ラベル値）群によって分けている画像のことである。ここで、特定される任意の領域には関心領域や関心体積（ＶＯＩ：ＶｏｌｕｍｅＯｆＩｎｔｅｒｅｓｔ）等が含まれる。 The image feature acquisition unit 101-44, the extraction unit 101-462, and the region detection unit 302 may generate a label image using a trained model for image segmentation and thereby perform image segmentation processing. Here, a label image is an image in which each pixel of the tomographic image is assigned a region label. Specifically, it is an image in which any region among the regions depicted in the acquired image is distinguished by a group of identifiable pixel values (hereinafter, label values). The regions that can be identified include a region of interest, a volume of interest (VOI), and the like.

画像から任意のラベル値を持つ画素の座標群を特定すると、画像中において対応する網膜層等の領域を描出している画素の座標群を特定できる。具体的には、例えば、網膜を構成する神経節細胞層を示すラベル値が１である場合、画像の画素群のうち画素値が１である座標群を特定し、画像から該座標群に対応する画素群を抽出する。これにより、当該画像における神経節細胞層の領域を特定できる。 By identifying the coordinates of the pixels that have a given label value in the label image, the coordinates of the pixels depicting the corresponding region, such as a retinal layer, can be identified in the image. Specifically, for example, when the label value indicating the ganglion cell layer of the retina is 1, the coordinates whose pixel value is 1 are identified among the pixels of the label image, and the pixels corresponding to those coordinates are extracted from the image. The region of the ganglion cell layer in the image can thereby be identified.
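The lookup described above can be sketched with nested lists standing in for images. The function names and the (row, column) coordinate convention are assumptions made for illustration.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def coords_with_label(label_image: List[List[int]], label_value: int) -> List[Coord]:
    """Return the (row, column) coordinates whose label equals label_value."""
    return [
        (r, c)
        for r, row in enumerate(label_image)
        for c, value in enumerate(row)
        if value == label_value
    ]


def extract_pixels(image: List[List[float]], coords: List[Coord]) -> List[float]:
    """Pick the pixels of the tomographic image at the given coordinates."""
    return [image[r][c] for r, c in coords]
```

Calling `coords_with_label(labels, 1)` and passing the result to `extract_pixels` yields the pixels of the region whose label value is 1, for instance the ganglion cell layer in the example above.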

なお、画像セグメンテーション処理には、ラベル画像に対する縮小又は拡大処理を実施する処理が含まれてもよい。このとき、ラベル画像の縮小又は拡大に用いる画像補完処理手法は、未定義のラベル値や対応する座標に存在しないはずのラベル値を誤って生成しないような、最近傍法等を使うものとする。 Note that the image segmentation processing may include processing for reducing or enlarging the label image. In that case, the image interpolation method used for the reduction or enlargement is one, such as the nearest neighbor method, that does not erroneously generate undefined label values or label values that should not exist at the corresponding coordinates.
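A minimal nearest-neighbor resize of a label image is sketched below to show why this choice is safe: every output pixel copies one input pixel, so no new label value can appear. The function name and the index mapping (floor of the scaled coordinate) are assumptions made for illustration.

```python
from typing import List


def resize_labels_nearest(label_image: List[List[int]], out_h: int, out_w: int) -> List[List[int]]:
    """Resize a label image by copying, for each output pixel, one source pixel."""
    in_h, in_w = len(label_image), len(label_image[0])
    return [
        [label_image[(r * in_h) // out_h][(c * in_w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]
```

By contrast, linear interpolation between labels 1 and 3 could produce the value 2, which might denote an unrelated region or no region at all.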

画像セグメンテーション処理とは、画像に描出された臓器や病変といった、ＲＯＩ（ＲｅｇｉｏｎＯｆＩｎｔｅｒｅｓｔ）やＶＯＩと呼ばれる領域を、画像診断や画像解析に利用するために特定する処理のことである。例えば、画像セグメンテーション処理によれば、後眼部を撮影対象としたＯＣＴの撮影によって取得された画像から、網膜を構成する層群の領域群を特定することができる。なお、画像に特定すべき領域が描出されていなければ特定される領域の数は０である。また、画像に特定すべき複数の領域群が描出されていれば、特定される領域の数は複数であってもよいし、又は、該領域群を含むように囲む領域１つであってもよい。 Image segmentation processing is processing that identifies regions called ROIs (Regions Of Interest) or VOIs, such as organs or lesions depicted in an image, for use in image diagnosis or image analysis. For example, image segmentation processing can identify the regions of the layers that make up the retina from an image acquired by OCT imaging of the posterior segment of the eye. Note that if no region to be identified is depicted in the image, the number of identified regions is 0. If a plurality of regions to be identified are depicted in the image, the number of identified regions may be plural, or a single region that encloses those regions may be identified.

特定された領域群は、その他の処理において利用可能な情報として出力される。具体的には、例えば、特定された領域群のそれぞれを構成する画素群の座標群を数値データ群として出力することができる。また、例えば、特定された領域群のそれぞれを含む矩形領域や楕円領域、長方体領域、楕円体領域等を示す座標群を数値データ群として出力することもできる。さらに、例えば、特定された領域群の境界にあたる直線や曲線、平面、又は曲面等を示す座標群を数値データ群として出力することもできる。また、例えば、特定された領域群を示すラベル画像を出力することもできる。 The identified region group is output as information that can be used in other processing. Specifically, for example, it is possible to output a group of coordinates of a group of pixels forming each of the identified region groups as a group of numerical data. Further, for example, a group of coordinates indicating a rectangular area, an elliptical area, a rectangular parallelepiped area, an ellipsoidal area, etc., including each of the identified area groups can be output as a numerical data group. Furthermore, for example, it is possible to output a group of coordinates indicating a straight line, a curve, a plane, a curved surface, or the like, which is the boundary of the specified area group, as a numerical data group. Also, for example, it is possible to output a label image indicating the identified region group.
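One of the output forms mentioned above, the coordinates of a rectangle enclosing an identified region, can be sketched as follows. The function name and the (row_min, col_min, row_max, col_max) tuple layout are assumptions made for illustration. Returning `None` corresponds to the case in which no region is identified.

```python
from typing import List, Optional, Tuple


def bounding_box(label_image: List[List[int]], label_value: int) -> Optional[Tuple[int, int, int, int]]:
    """Return (row_min, col_min, row_max, col_max) of the labeled region, or None if absent."""
    coords = [
        (r, c)
        for r, row in enumerate(label_image)
        for c, value in enumerate(row)
        if value == label_value
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))
```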

ここで、画像セグメンテーション用の機械学習モデルとしては、例えば、畳み込みニューラルネットワーク（ＣＮＮ）を用いることができる。なお、本変形例で用いるＣＮＮの構成は、複数のダウンサンプリング層を含む複数の階層からなるエンコーダーの機能と、複数のアップサンプリング層を含む複数の階層からなるデコーダーの機能とを有するＵ－ｎｅｔ型の機械学習モデルとすることができる。Ｕ－ｎｅｔ型の機械学習モデルでは、エンコーダーとして構成される複数の階層において曖昧にされた位置情報（空間情報）を、デコーダーとして構成される複数の階層において、同次元の階層（互いに対応する階層）で用いることができるように（例えば、スキップコネクションを用いて）構成される。 Here, for example, a convolutional neural network (CNN) can be used as the machine learning model for image segmentation. The CNN used in this modification can be a U-net type machine learning model that has an encoder function, made up of a plurality of levels including a plurality of downsampling layers, and a decoder function, made up of a plurality of levels including a plurality of upsampling layers. In a U-net type machine learning model, position information (spatial information) that is blurred in the levels configured as the encoder is made available (for example, through skip connections) at the levels of the same dimension (the mutually corresponding levels) among the levels configured as the decoder.

また、ＣＮＮの構成の変更例として、例えば、畳み込み層の後にバッチ正規化（ＢａｔｃｈＮｏｒｍａｌｉｚａｔｉｏｎ）層や、正規化線形関数（ＲｅｃｔｉｆｉｅｒＬｉｎｅａｒＵｎｉｔ）を用いた活性化層を組み込む等をしてもよい。ＣＮＮのこれらのステップを通して、撮影画像の特徴を抽出することができる。 Further, as a modification of the configuration of the CNN, for example, a batch normalization layer or an activation layer using a rectifier linear unit may be incorporated after the convolutional layer. Through these steps of CNN, the features of the captured image can be extracted.

なお、本変形例に係る機械学習モデルとしては、例えば、ＣＮＮ（Ｕ－ｎｅｔ型の機械学習モデル）、ＣＮＮとＬＳＴＭを組み合わせたモデル、ＦＣＮ（ＦｕｌｌｙＣｏｎｖｏｌｕｔｉｏｎａｌＮｅｔｗｏｒｋ）、又はＳｅｇＮｅｔ等を用いることができる。また、所望の構成に応じて、物体認識を行う機械学習モデル等を用いることもできる。物体認識を行う機械学習モデルとしては、例えば、ＲＣＮＮ（ＲｅｇｉｏｎＣＮＮ）、ｆａｓｔＲＣＮＮ、又はｆａｓｔｅｒＲＣＮＮを用いることができる。さらに、領域単位で物体認識を行う機械学習モデルを用いることもできる。領域単位で物体認識を行う機械学習モデルとしては、ＹＯＬＯ（ＹｏｕＯｎｌｙＬｏｏｋＯｎｃｅ）、又はＳＳＤ（ＳｉｎｇｌｅＳｈｏｔＤｅｔｅｃｔｏｒ、あるいはＳｉｎｇｌｅＳｈｏｔＭｕｌｔｉＢｏｘＤｅｔｅｃｔｏｒ）を用いることもできる。 As the machine learning model according to this modification, for example, CNN (U-net type machine learning model), a model combining CNN and LSTM, FCN (Fully Convolutional Network), or SegNet can be used. . A machine learning model or the like that performs object recognition can also be used depending on the desired configuration. As a machine learning model for object recognition, for example, RCNN (Region CNN), fastRCNN, or fasterRCNN can be used. Furthermore, it is also possible to use a machine learning model that performs object recognition on a region-by-region basis. YOLO (You Only Look Once) or SSD (Single Shot Detector or Single Shot MultiBox Detector) can also be used as a machine learning model that performs object recognition on a region-by-region basis.

また、画像セグメンテーション用の機械学習モデルの学習データは、ＯＣＴにより取得された断層画像を入力データとし、当該断層画像について画素毎に領域のラベルが付されたラベル画像を出力データとする。ラベル画像としては、例えば、内境界膜（ＩＬＭ）、神経線維層（ＮＦＬ）、神経節細胞層（ＧＣＬ）、視細胞内節外節接合部（ＩＳＯＳ）、網膜色素上皮層（ＲＰＥ）、ブルッフ膜（ＢＭ）、及び脈絡膜等のラベルが付されたラベル画像を用いることができる。なお、その他の領域として、例えば、硝子体、強膜、外網状層（ＯＰＬ）、外顆粒層（ＯＮＬ）、内網状層（ＩＰＬ）、内顆粒層（ＩＮＬ）、角膜、前房、虹彩、及び水晶体等のラベルが付された画像を用いてもよい。 In addition, learning data for a machine learning model for image segmentation uses a tomographic image obtained by OCT as input data, and outputs a labeled image in which a region is labeled for each pixel of the tomographic image. Label images include, for example, the inner limiting membrane (ILM), nerve fiber layer (NFL), ganglion cell layer (GCL), photoreceptor inner segment outer segment junction (ISOS), retinal pigment epithelium layer (RPE), Bruch Label images with labels such as membrane (BM) and choroid can be used. Other regions include, for example, vitreous body, sclera, outer plexiform layer (OPL), outer nuclear layer (ONL), inner plexiform layer (IPL), inner nuclear layer (INL), cornea, anterior chamber, iris, and labeled images such as lenses may be used.

また、画像セグメンテーション用の機械学習モデルの入力データは断層画像に限られない。前眼画像やＳＬＯ画像、ＯＣＴＡ画像等であってもよい。この場合、学習データは、各種画像を入力データとし、各種画像の画素毎に領域名等がラベル付けされたラベル画像を出力データとすることができる。例えば、学習データの入力データがＳＬＯ画像である場合には、出力データは、視神経乳頭の周辺部、Ｄｉｓｃ、及びＣｕｐ等のラベルが付された画像であってよい。 Input data for a machine learning model for image segmentation is not limited to tomographic images. It may be an anterior eye image, an SLO image, an OCTA image, or the like. In this case, the learning data can use various images as input data and output data as label images in which each pixel of each image is labeled with a region name or the like. For example, if the input data of the learning data are SLO images, the output data may be images labeled with the periphery of the optic disc, Disc, Cup, and the like.

なお、出力データとして用いられるラベル画像は、医師等により断層画像において各領域にラベルが付された画像であってもよいし、ルールベースの領域検出処理により各領域にラベルが付された画像であってもよい。ただし、適切にラベル付けが行われていないラベル画像を学習データの出力データとして用いて機械学習を行うと、当該学習データを用いて学習した学習済モデルを用いて得た画像も適切にラベル付けが行われていないラベル画像となってしまう可能性がある。そのため、そのようなラベル画像を含むペアを学習データから取り除くことで、学習済モデルを用いて適切でないラベル画像が生成される可能性を低減させることができる。ここで、ルールベースの領域検出処理とは、例えば網膜の形状の規則性等の既知の規則性を利用した検出処理をいう。 Note that the label image used as output data may be an image in which a doctor or the like has labeled each region in the tomographic image, or an image in which each region has been labeled by rule-based region detection processing. However, if machine learning is performed using label images that have not been labeled appropriately as the output data of the learning data, an image obtained using the model trained on that learning data may also be a label image that is not labeled appropriately. Therefore, removing pairs that include such label images from the learning data can reduce the possibility that the trained model generates inappropriate label images. Here, rule-based region detection processing refers to detection processing that uses a known regularity, such as the regularity of the shape of the retina.

画像特徴取得部１０１－４４、抽出部１０１－４６２、及び領域検出部３０２は、このような画像セグメンテーション用の学習済モデルを用いて、画像セグメンテーション処理を行うことで、各種画像について特定の領域を高速に精度良く検出することが期待できる。なお、画像セグメンテーション用の学習済モデルも、入力データである各種画像の種類毎に用意されてもよい。また、ＯＣＴＡ正面画像やＥｎ－Ｆａｃｅ画像については、画像を生成するための深度範囲毎に学習済モデルが用意されてもよい。さらに、画像セグメンテーション用の学習済モデルも、撮影部位（例えば、黄斑部中心、視神経乳頭部中心）毎の画像について学習を行ったものでもよいし、撮影部位を関わらず学習を行ったものであってもよい。 By performing image segmentation processing using such a trained model for image segmentation, the image feature acquisition unit 101-44, the extraction unit 101-462, and the region detection unit 302 can be expected to detect specific regions in various images quickly and accurately. Note that a trained model for image segmentation may also be prepared for each type of image used as input data. For OCTA front images and En-Face images, a trained model may be prepared for each depth range used to generate the image. Furthermore, the trained model for image segmentation may be one trained on images for each imaging site (for example, centered on the macula or centered on the optic disc), or one trained regardless of the imaging site.

また、画像セグメンテーション用の学習済モデルについては、操作者の指示に応じて手動で修正されたデータを学習データとして追加学習が行われてもよい。また、追加学習の要否の判断やサーバにデータを送信するか否かの判断も同様の方法で行われてよい。これらの場合にも、各処理の精度を向上させたり、検者の好みの傾向に応じた処理を行えたりすることが期待できる。 In addition, with respect to the trained model for image segmentation, additional learning may be performed using data manually corrected in accordance with an operator's instruction as learning data. In addition, determination of necessity of additional learning and determination of whether or not to transmit data to the server may be made in the same manner. In these cases as well, it is expected that the accuracy of each process can be improved, and that processes can be performed according to the tendency of the examiner's preferences.

さらに、画像処理装置１０１は、学習済モデルを用いて、被検眼２００の部分領域（例えば、注目部位、アーチファクト領域、異常部位等）を検出する場合には、検出した部分領域毎に所定の画像処理を施すこともできる。例として、硝子体領域、網膜領域、及び脈絡膜領域のうちの少なくとも２つの部分領域を検出する場合について述べる。この場合には、検出された少なくとも２つの部分領域に対してコントラスト調整等の画像処理を施す際に、それぞれ異なる画像処理のパラメータを用いることで、各領域に適した調整を行うことができる。各領域に適した調整が行われた画像を表示することで、操作者は部分領域毎の疾病等をより適切に診断することができる。なお、検出された部分領域毎に異なる画像処理のパラメータを用いる構成については、学習済モデルを用いずに被検眼２００の部分領域を検出して求めた被検眼２００の部分領域について同様に適用されてもよい。 Furthermore, when the image processing apparatus 101 uses a trained model to detect partial regions of the subject's eye 200 (for example, a site of interest, an artifact region, or an abnormal site), it can also apply predetermined image processing to each detected partial region. As an example, consider the case of detecting at least two partial regions among the vitreous region, the retinal region, and the choroidal region. In this case, when image processing such as contrast adjustment is applied to the at least two detected partial regions, using different image processing parameters for each region allows an adjustment suited to each region. By displaying an image in which each region has been adjusted appropriately, the operator can more appropriately diagnose disease and the like in each partial region. Note that the configuration that uses different image processing parameters for each detected partial region may likewise be applied to partial regions of the subject's eye 200 that are detected without using a trained model.
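Applying a different image processing parameter to each detected partial region can be sketched with a per-label gamma correction. Gamma correction is used here only as a stand-in for "contrast adjustment", and the function name, the label values, and the assumption that pixel values lie in [0, 1] are all introduced for illustration.

```python
from typing import Dict, List


def adjust_by_region(
    image: List[List[float]],
    labels: List[List[int]],
    gamma_by_label: Dict[int, float],
    default_gamma: float = 1.0,
) -> List[List[float]]:
    """Apply pixel ** gamma, with gamma chosen by the region label of each pixel."""
    return [
        [pixel ** gamma_by_label.get(label, default_gamma) for pixel, label in zip(img_row, lab_row)]
        for img_row, lab_row in zip(image, labels)
    ]
```

For instance, a gamma below 1 could brighten a dark vitreous region while a gamma above 1 is applied to the retinal region, and pixels with unlisted labels are left unchanged.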

［変形例５］ [Modification 5]
上述した様々な実施形態及び変形例における表示制御部１０１－５は、断層画像撮影後に表示画面のレポート画面において、所望の層の層厚や各種の血管密度等の解析結果を表示させてもよい。また、視神経乳頭部、黄斑部、血管領域、毛細血管領域、動脈領域、静脈領域、神経線維束、硝子体領域、黄斑領域、脈絡膜領域、強膜領域、篩状板領域、網膜層境界、網膜層境界端部、視細胞、血球、血管壁、血管内壁境界、血管外側境界、神経節細胞、角膜領域、隅角領域、シュレム管等の少なくとも１つを含む注目部位に関するパラメータの値（分布）を解析結果として表示させてもよい。ここで、注目部位は、例えば、Ｈａｌｌｅｒ層における血管（脈絡膜領域の一部の深度範囲における血管の一例）の眼外への流出口である渦静脈等であってもよい。このとき、注目部位に関するパラメータは、例えば、渦静脈の個数（例えば、領域毎の個数）や、視神経乳頭部から各渦静脈までの距離、視神経乳頭を中心とする各渦静脈の位置する角度等であってもよい。これにより、例えば、Ｐａｃｈｙｃｈｏｒｏｉｄ（肥厚した脈絡膜）に関する種々の疾患（例えば、脈絡膜新生血管症）等を精度よく診断することが可能となる。また、例えば、各種のアーチファクトの低減処理が適用された医用画像を解析することで、上述した種々の解析結果を精度の良い解析結果として表示させることができる。なお、アーチファクトは、例えば、血管領域等による光吸収により生じる偽像領域や、プロジェクションアーチファクト、被検眼の状態（動きや瞬き等）によって測定光の主走査方向に生じる正面画像における帯状のアーチファクト等であってもよい。また、アーチファクトは、例えば、被検者の所定部位の医用画像上に撮影毎にランダムに生じるような写損領域であれば、何でもよい。また、表示制御部１０１－５は、上述したような様々なアーチファクト（写損領域）の少なくとも１つを含む領域に関するパラメータの値（分布）を解析結果として出力部１０３に表示させてもよい。また、ドルーゼン、新生血管、白斑（硬性白斑）、及びシュードドルーゼン等の異常部位等の少なくとも１つを含む領域に関するパラメータの値（分布）を解析結果として表示させてもよい。また、標準データベースを用いて得た標準値や標準範囲と、解析結果とを比較して得た比較結果が表示されてもよい。 The display control unit 101-5 in the various embodiments and modifications described above may display analysis results, such as the thickness of a desired layer and various blood vessel densities, on the report screen of the display screen after tomographic imaging. It may also display, as analysis results, the values (distribution) of parameters relating to a site of interest that includes at least one of the optic disc, macula, blood vessel region, capillary region, artery region, vein region, nerve fiber bundle, vitreous region, macular region, choroidal region, scleral region, lamina cribrosa region, retinal layer boundary, retinal layer boundary end, photoreceptor cells, blood cells, blood vessel wall, blood vessel inner wall boundary, blood vessel outer boundary, ganglion cells, corneal region, angle region, Schlemm's canal, and the like. Here, the site of interest may be, for example, a vortex vein, which is the outlet through which blood vessels in Haller's layer (an example of blood vessels in a partial depth range of the choroidal region) leave the eye. In that case, the parameters relating to the site of interest may be, for example, the number of vortex veins (for example, the number per region), the distance from the optic disc to each vortex vein, and the angle at which each vortex vein is located about the optic disc. This makes it possible to accurately diagnose, for example, various diseases related to pachychoroid (thickened choroid), such as choroidal neovascularization. Further, for example, by analyzing a medical image to which various artifact reduction processes have been applied, the analysis results described above can be displayed as highly accurate analysis results. Artifacts may be, for example, false image regions caused by light absorption by blood vessel regions and the like, projection artifacts, or band-shaped artifacts in a front image that arise along the main scanning direction of the measurement light depending on the state of the subject's eye (movement, blinking, and so on). An artifact may also be any imaging failure region that arises randomly at each imaging on a medical image of a predetermined site of the subject. The display control unit 101-5 may also cause the output unit 103 to display, as analysis results, the values (distribution) of parameters relating to a region that includes at least one of the various artifacts (imaging failure regions) described above. The values (distribution) of parameters relating to a region that includes at least one abnormal site, such as drusen, neovascularization, exudates (hard exudates), or pseudodrusen, may also be displayed as analysis results. A comparison result obtained by comparing the analysis results with standard values or standard ranges obtained using a standard database may also be displayed.

また、解析結果は、解析マップや、各分割領域に対応する統計値を示すセクター等で表示されてもよい。なお、解析結果は、医用画像の解析結果を学習データとして学習して得た学習済モデル（解析結果生成エンジン、解析結果生成用の学習済モデル）を用いて生成されたものであってもよい。このとき、学習済モデルは、医用画像とその医用画像の解析結果とを含む学習データや、医用画像とその医用画像とは異なる種類の医用画像の解析結果とを含む学習データ等を用いた学習により得たものであってもよい。 Also, the analysis results may be displayed as an analysis map, as sectors showing statistical values corresponding to each divided region, or the like. Note that the analysis results may be generated using a trained model (analysis result generation engine, trained model for generating analysis results) obtained by training with analysis results of medical images as learning data. In that case, the trained model may be obtained by training with learning data that includes a medical image and the analysis results of that medical image, learning data that includes a medical image and the analysis results of a medical image of a different type, or the like.

また、画像解析を行うための学習データは、画像セグメンテーション処理用の学習済モデルを用いて生成されたラベル画像と、当該ラベル画像を用いた医用画像の解析結果とを含んだものでもよい。この場合、画像処理装置１０１は、例えば、解析結果生成用の学習済モデルを用いて、画像セグメンテーション処理の結果から、断層画像の解析結果を生成する、解析結果生成部の一例として機能することができる。さらに、学習済モデルは、輝度のＥｎ－Ｆａｃｅ画像及びモーションコントラスト正面画像（ＯＣＴＡのＥｎ－Ｆａｃｅ画像）のように、所定部位の異なる種類の複数の医用画像をセットとする入力データを含む学習データを用いた学習により得たものであってもよい。 Also, the learning data for image analysis may include a label image generated using a trained model for image segmentation processing and the analysis results of a medical image obtained using that label image. In this case, the image processing apparatus 101 can function as an example of an analysis result generation unit that generates the analysis results of a tomographic image from the result of image segmentation processing, for example by using a trained model for generating analysis results. Furthermore, the trained model may be obtained by training with learning data whose input data is a set of a plurality of medical images of different types of a predetermined site, such as a luminance En-Face image and a motion contrast front image (OCTA En-Face image).

また、高画質化モデルを用いて生成された高画質画像を用いて得た解析結果が表示されるように構成されてもよい。この場合、学習データに含まれる入力データとしては、高画質化用の学習済モデルを用いて生成された高画質画像であってもよいし、低画質画像と高画質画像とのセットであってもよい。なお、学習データは、学習済モデルを用いて高画質化された画像について、手動又は自動で少なくとも一部に修正が施された画像であってもよい。 Further, the configuration may be such that analysis results obtained using a high-quality image generated with the image quality enhancement model are displayed. In this case, the input data included in the learning data may be a high-quality image generated using a trained model for image quality enhancement, or may be a set of a low-quality image and a high-quality image. Note that the learning data may be an image obtained by manually or automatically correcting at least part of an image whose quality was enhanced using a trained model.

また、学習データは、例えば、解析領域を解析して得た解析値（例えば、平均値や中央値等）、解析値を含む表、解析マップ、画像におけるセクター等の解析領域の位置等の少なくとも１つを含む情報を（教師あり学習の）正解データとして、入力データにラベル付け（アノテーション）したデータであってもよい。なお、操作者からの指示に応じて、解析結果生成用の学習済モデルを用いて得た解析結果が表示されるように構成されてもよい。 In addition, the learning data includes, for example, an analysis value obtained by analyzing the analysis area (e.g., average value, median value, etc.), a table containing the analysis value, an analysis map, the position of the analysis area such as a sector in the image, etc. It may be data obtained by labeling (annotating) input data with information including one as correct data (for supervised learning). Note that the analysis results obtained using the learned model for generating analysis results may be displayed according to instructions from the operator.

また、上述した実施形態及び変形例における表示制御部１０１－５は、表示画面のレポート画面において、糖尿病網膜症や、緑内障、加齢黄斑変性症等の種々の診断結果を表示させてもよい。このとき、例えば、上述したような各種のアーチファクトの低減処理が適用された医用画像を解析することで、精度の良い診断結果を表示させることができる。また、診断結果は、特定された異常部位等の位置を画像上に表示されてもよいし、異常部位の状態等を文字等によって表示されてもよい。さらに、異常部位等の分類結果（例えば、カーティン分類）を診断結果として表示させてもよい。また、分類結果としては、例えば、異常部位毎の確からしさを示す情報（例えば、割合を示す数値）が表示されてもよい。また、医師が診断を確定させる上で必要な情報が診断結果として表示されてもよい。上記必要な情報としては、例えば、追加撮影等のアドバイスが考えられる。例えば、ＯＣＴＡ画像における血管領域に異常部位が検出された場合には、ＯＣＴＡよりも詳細に血管を観察可能な造影剤を用いた蛍光撮影を追加で行う旨が表示されてもよい。また、診断結果は、被検者の今後の診療方針等に関する情報であってもよい。また、診断結果は、例えば、診断名、病変（異常部位）の種類や状態（程度）、画像における病変の位置、注目領域に対する病変の位置、所見（読影所見等）、診断名の根拠（肯定的な医用支援情報等）、及び診断名を否定する根拠（否定的な医用支援情報）等の少なくとも１つを含む情報であってもよい。このとき、例えば、検者からの指示に応じて入力された診断名等の診断結果よりも確からしい診断結果を医用支援情報として表示させてもよい。また、複数の種類の医用画像が用いられた場合には、例えば、診断結果の根拠となり得る種類の医用画像が識別可能に表示されてもよい。また、診断結果の根拠としては、学習済モデルが抽出した特徴量を可視化したマップ（注意マップ、活性化マップ）で、例えば、特徴量をカラーで示したカラーマップ（ヒートマップ）であってもよい。このとき、例えば、入力データとした医用画像にヒートマップを重畳表示させてもよい。なお、ヒートマップは、例えば、予測（推定）されるクラスの出力値への寄与が大きい領域（勾配が大きい領域）を可視化する手法であるＧｒａｄ－ＣＡＭ（Ｇｒａｄｉｅｎｔ－ｗｅｉｇｈｔｅｄＣｌａｓｓＡｃｔｉｖａｔｉｏｎＭａｐｐｉｎｇ）やＧｕｉｄｅｄＧｒａｄ－ＣＡＭ等を用いて得ることができる。 Further, the display control unit 101-5 in the above-described embodiment and modification may display various diagnostic results such as diabetic retinopathy, glaucoma, age-related macular degeneration, etc. on the report screen of the display screen. At this time, for example, by analyzing a medical image to which various artifact reduction processes as described above have been applied, it is possible to display a highly accurate diagnosis result. Further, the diagnosis result may display the position of the identified abnormal site or the like on an image, or may display the state of the abnormal site or the like in characters or the like. Further, classification results (for example, Curtin classification) such as abnormal sites may be displayed as diagnosis results. 
Further, as the classification result, for example, information indicating the likelihood of each abnormal site (for example, a numerical value indicating a ratio) may be displayed. Information necessary for the doctor to confirm the diagnosis may also be displayed as the diagnosis result. Such information may be, for example, advice such as additional imaging. For example, when an abnormal site is detected in a blood vessel region in an OCTA image, a message may be displayed recommending additional fluorescence imaging using a contrast agent, which enables observation of blood vessels in more detail than OCTA. Further, the diagnosis result may be information related to the subject's future treatment policy and the like. In addition, the diagnosis result may be information including at least one of, for example, the diagnosis name, the type and state (degree) of the lesion (abnormal site), the position of the lesion in the image, the position of the lesion with respect to the region of interest, findings (interpretation findings, etc.), the basis for the diagnosis name (positive medical support information, etc.), and grounds for denying the diagnosis name (negative medical support information). At this time, for example, a diagnosis result that is more likely than a diagnosis result, such as a diagnosis name, input in response to an instruction from the examiner may be displayed as medical support information. In addition, when a plurality of types of medical images are used, for example, the type of medical image that can serve as the basis for the diagnosis result may be displayed in an identifiable manner. In addition, the basis for the diagnosis result may be a map (attention map, activation map) that visualizes the feature amount extracted by the trained model, for example, a color map (heat map) that shows the feature amount in color. At this time, for example, the heat map may be superimposed on the medical image used as the input data.
The heat map can be obtained using, for example, Grad-CAM (Gradient-weighted Class Activation Mapping), which is a technique for visualizing regions that contribute strongly to the output value of the predicted (estimated) class (regions with large gradients), or Guided Grad-CAM or the like.
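As a rough illustration of how such a heat map could be computed, the following minimal sketch follows the general Grad-CAM idea: each feature map is weighted by the spatial average of the gradient of the class score, the weighted maps are summed, and negative values are discarded. The function name `grad_cam` and the nested-list data layout are assumptions made for this example and are not taken from the embodiments; in practice the activations and gradients would be obtained from a layer of the trained model.

```python
def grad_cam(activations, gradients):
    """Hypothetical helper: activations and gradients are lists of K feature
    maps, each an H x W nested list taken from one layer of a trained model."""
    heat = None
    for fmap, grad in zip(activations, gradients):
        count = sum(len(row) for row in grad)
        # Channel weight: spatial average of the class-score gradient.
        weight = sum(sum(row) for row in grad) / count
        if heat is None:
            heat = [[0.0] * len(row) for row in fmap]
        for i, row in enumerate(fmap):
            for j, value in enumerate(row):
                heat[i][j] += weight * value
    # ReLU: keep only regions that increase the class score.
    heat = [[max(0.0, v) for v in row] for row in heat]
    peak = max(max(row) for row in heat)
    if peak > 0:
        # Normalize to the range 0..1 so the map can be color-coded.
        heat = [[v / peak for v in row] for row in heat]
    return heat
```

The resulting map, normalized to the range 0 to 1, could then be resized and color-coded before being superimposed on the input medical image.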

なお、診断結果は、医用画像の診断結果を学習データとして学習して得た学習済モデル（診断結果生成エンジン、診断結果生成用の学習済モデル）を用いて生成されたものであってもよい。また、学習済モデルは、医用画像とその医用画像の診断結果とを含む学習データや、医用画像とその医用画像とは異なる種類の医用画像の診断結果とを含む学習データ等を用いた学習により得たものであってもよい。 The diagnosis result may be generated using a trained model (a diagnosis result generation engine, a trained model for generating diagnosis results) obtained by learning diagnosis results of medical images as learning data. In addition, the trained model may be obtained by learning using learning data including medical images and diagnosis results of those medical images, learning data including medical images and diagnosis results of medical images of a different type from those medical images, and the like.

また、学習データは、領域認識エンジンやセグメンテーション処理用の学習済モデルを用いて生成されたラベル画像と、当該ラベル画像を用いた医用画像の診断結果とを含んだものでもよい。この場合、画像処理装置１０１は、例えば、診断結果生成用の学習済モデルを用いて、画像セグメンテーション処理の結果から、断層画像の診断結果を生成する、診断結果生成部の一例として機能することができる。 Also, the learning data may include label images generated using a region recognition engine or a trained model for segmentation processing, and diagnosis results of medical images obtained using those label images. In this case, the image processing apparatus 101 can function as an example of a diagnosis result generation unit that generates a diagnosis result of a tomographic image from the result of image segmentation processing using, for example, a trained model for generating diagnosis results.

さらに、高画質化エンジンを用いて生成された高画質画像を用いて得た診断結果が表示されるように構成されてもよい。この場合、学習データに含まれる入力データとしては、高画質化エンジンを用いて生成された高画質画像であってもよいし、低画質画像と高画質画像とのセットであってもよい。なお、学習データは、学習済モデルを用いて高画質化された画像について、手動又は自動で少なくとも一部に修正が施された画像であってもよい。 Furthermore, it may be configured to display the diagnosis result obtained using the high-quality image generated using the high-quality image engine. In this case, the input data included in the learning data may be a high-quality image generated using a high-quality image engine, or may be a set of a low-quality image and a high-quality image. Note that the learning data may be an image obtained by manually or automatically correcting at least a part of an image whose image quality has been improved using a trained model.

また、学習データは、例えば、診断名、病変（異常部位）の種類や状態（程度）、画像における病変の位置、注目領域に対する病変の位置、所見（読影所見等）、診断名の根拠（肯定的な医用支援情報等）、診断名を否定する根拠（否定的な医用支援情報）等の少なくとも１つを含む情報を（教師あり学習の）正解データとして、入力データにラベル付け（アノテーション）したデータを用いてもよい。なお、検者からの指示に応じて、診断結果生成用の学習済モデルを用いて得た診断結果が表示されるように構成されてもよい。 In addition, the learning data may be data in which the input data is labeled (annotated) with, as correct data (for supervised learning), information including at least one of, for example, the diagnosis name, the type and state (degree) of the lesion (abnormal site), the position of the lesion in the image, the position of the lesion with respect to the region of interest, findings (interpretation findings, etc.), the basis for the diagnosis name (positive medical support information, etc.), and grounds for denying the diagnosis name (negative medical support information). Note that the diagnosis result obtained using the trained model for generating diagnosis results may be displayed in response to an instruction from the examiner.

なお、入力データとして用いる情報毎又は情報の種類毎に学習済モデルを用意し、学習済モデルを用いて、診断結果を取得してもよい。この場合、各学習済モデルから出力された情報に統計的な処理を行い、最終的な診断結果を決定してもよい。例えば、各学習済モデルから出力された情報の割合を各種類の情報毎に加算し、他の情報よりも割合の合計が高い情報を最終的な診断結果として決定してもよい。なお、統計的な処理は合計の算出に限られず、平均値や中央値の算出等であってもよい。また、例えば、各学習済モデルから出力された情報のうち、他の情報よりも割合の高い情報（最も割合の高い情報）を用いて診断結果を決定してもよい。同様に、各学習済モデルから出力された情報のうち、閾値以上である割合の情報を用いて診断結果を決定してもよい。 Note that a trained model may be prepared for each piece of information used as input data or for each type of information, and a diagnosis result may be acquired using the trained model. In this case, the information output from each trained model may be statistically processed to determine the final diagnostic result. For example, the ratio of information output from each trained model may be added for each type of information, and information with a higher total ratio than other information may be determined as the final diagnosis result. Statistical processing is not limited to calculation of the total, and may be calculation of an average value, a median value, or the like. Further, for example, among the information output from each trained model, information with a higher percentage than other information (information with the highest percentage) may be used to determine the diagnosis result. Similarly, out of the information output from each trained model, the information of the ratio of the threshold value or more may be used to determine the diagnostic result.
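The statistical processing described above can be sketched as follows. This is only an illustrative example under stated assumptions: the function name `aggregate_diagnoses`, the dictionary layout (one mapping of diagnosis label to ratio per trained model), and the label strings are hypothetical and do not appear in the embodiments.

```python
from statistics import mean, median


def aggregate_diagnoses(model_outputs, reduce="sum", threshold=None):
    """Hypothetical helper: model_outputs is a list of {label: ratio} dicts,
    one per trained model. Returns (final_label, per_label_scores)."""
    reducers = {"sum": sum, "mean": mean, "median": median, "max": max}
    per_label = {}
    for output in model_outputs:
        for label, ratio in output.items():
            per_label.setdefault(label, []).append(ratio)
    # Combine the ratios of each label across all models.
    scores = {label: reducers[reduce](vals) for label, vals in per_label.items()}
    if threshold is not None:
        # Keep only labels whose combined ratio reaches the threshold.
        scores = {label: s for label, s in scores.items() if s >= threshold}
    best = max(scores, key=scores.get) if scores else None
    return best, scores


outputs = [
    {"glaucoma": 0.7, "AMD": 0.2},
    {"glaucoma": 0.4, "AMD": 0.5},
    {"glaucoma": 0.6, "AMD": 0.3},
]
final_label, _ = aggregate_diagnoses(outputs, reduce="sum")
```

Here `reduce="sum"` corresponds to adding the ratios per type of information, `"mean"` and `"median"` to the other statistical processing mentioned, and `"max"` combined with `threshold` to using only information whose ratio is the highest or is at or above a threshold.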

また、操作者の指示（選択）に応じて、決定された診断結果の良否の判定（承認）が可能に構成されてもよい。また、操作者の指示（選択）に応じて、各学習済モデルから出力された情報から診断結果を決定してもよい。このとき、例えば、表示制御部１０１－５が、各学習済モデルから出力された情報及びその割合を並べて出力部１０３に表示させてもよい。そして、操作者が、例えば、他の情報よりも割合の高い情報を選択することにより、選択された情報を診断結果として決定するように構成されてもよい。さらに、各学習済モデルから出力された情報から、機械学習モデルを用いて、診断結果を決定してもよい。この場合には、機械学習アルゴリズムとして、診断結果生成に用いられた機械学習アルゴリズムとは異なる種類の機械学習アルゴリズムであってもよく、例えば、ニューラルネットワーク、サポートベクターマシン、アダブースト、ベイジアンネットワーク、又はランダムフォレスト等を用いてよい。 Further, the apparatus may be configured so that the determined diagnosis result can be judged acceptable or not (approved) according to the operator's instruction (selection). Further, the diagnosis result may be determined from the information output from each trained model according to the instruction (selection) of the operator. At this time, for example, the display control unit 101-5 may cause the output unit 103 to display the information output from each trained model and the ratio thereof side by side. Then, the operator may select, for example, information with a higher ratio than other information, whereby the selected information is determined as the diagnosis result. Furthermore, a diagnosis result may be determined using a machine learning model from the information output from each trained model. In this case, the machine learning algorithm may be of a different type from the machine learning algorithm used to generate the diagnosis result; for example, a neural network, a support vector machine, AdaBoost, a Bayesian network, or a random forest may be used.

なお、上述した種々の学習済モデルの学習は、教師あり学習（ラベル付きの学習データで学習）だけでなく、半教師あり学習であってもよい。半教師あり学習は、例えば、複数の識別器（分類器）がそれぞれ教師あり学習を行った後、ラベルのない学習データを識別（分類）し、識別結果（分類結果）の信頼度に応じて（例えば、確からしさが閾値以上の識別結果を）自動的にラベル付け（アノテーション）し、ラベル付けされた学習データで学習を行う手法である。半教師あり学習は、例えば、共訓練（Ｃｏ－Ｔｒａｉｎｉｎｇ、あるいはＭｕｌｔｉｖｉｅｗ）であってもよい。このとき、診断結果生成用の学習済モデルは、例えば、正常な被検体の医用画像を識別する第１の識別器と、特定の病変を含む医用画像を識別する第２の識別器とを用いて半教師あり学習（例えば、共訓練）して得た学習済モデルであってもよい。なお、診断目的に限らず、例えば撮影支援等を目的としてもよい。この場合、第２の識別器は、例えば、注目部位やアーチファクト領域等の部分領域を含む医用画像を識別するものであってもよい。 The learning of the various trained models described above may be not only supervised learning (learning using labeled learning data) but also semi-supervised learning. Semi-supervised learning is, for example, a technique in which multiple discriminators (classifiers) each perform supervised learning and then identify (classify) unlabeled learning data; the data are automatically labeled (annotated) according to the reliability of the identification results (classification results) (for example, using identification results whose likelihood is equal to or greater than a threshold), and learning is performed using the labeled learning data. Semi-supervised learning may be, for example, co-training (or multiview training). At this time, the trained model for generating diagnosis results may be, for example, a trained model obtained by semi-supervised learning (for example, co-training) using a first discriminator that identifies medical images of normal subjects and a second discriminator that identifies medical images containing a specific lesion. Note that the purpose is not limited to diagnosis and may be, for example, imaging support. In this case, the second discriminator may, for example, identify medical images including a partial region such as a region of interest or an artifact region.
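The automatic labeling step of semi-supervised learning can be sketched as follows. This is a simplified illustration rather than a full co-training procedure: the function name `pseudo_label`, the threshold value, and the toy classifiers (which return a label and a likelihood for a single scalar feature) are assumptions made for this example.

```python
def pseudo_label(unlabeled, classifiers, threshold=0.9):
    """Hypothetical helper: label each unlabeled sample with the most confident
    classifier output, keeping only results at or above the threshold."""
    accepted = []
    for sample in unlabeled:
        # Each classifier returns (label, likelihood); keep the most likely one.
        label, likelihood = max(
            (clf(sample) for clf in classifiers), key=lambda result: result[1]
        )
        if likelihood >= threshold:
            accepted.append((sample, label))
    return accepted


# Toy stand-ins for a first discriminator (normal) and a second one (lesion).
def first_discriminator(x):
    return ("normal", 1.0 - x)


def second_discriminator(x):
    return ("lesion", x)


newly_labeled = pseudo_label(
    [0.05, 0.5, 0.95], [first_discriminator, second_discriminator]
)
```

The accepted pairs would then be added to the labeled learning data, and the discriminators retrained; samples whose likelihood is below the threshold remain unlabeled.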

また、上述した様々な実施形態及び変形例に係る表示制御部１０１－５は、表示画面のレポート画面において、上述したような注目部位、アーチファクト領域、及び異常部位等の部分領域の物体認識結果（物体検出結果）やセグメンテーション結果を表示させてもよい。このとき、例えば、画像上の物体の周辺に矩形の枠等を重畳して表示させてもよい。また、例えば、画像における物体上に色等を重畳して表示させてもよい。なお、物体認識結果やセグメンテーション結果は、物体認識やセグメンテーションを示す情報を正解データとして医用画像にラベル付け（アノテーション）した学習データを学習して得た学習済モデル（物体認識エンジン、物体認識用の学習済モデル、セグメンテーションエンジン、セグメンテーション用の学習済モデル）を用いて生成されたものであってもよい。なお、上述した解析結果生成や診断結果生成は、上述した物体認識結果やセグメンテーション結果を利用することで得られたものであってもよい。例えば、物体認識やセグメンテーションの処理により得た注目部位に対して解析結果生成や診断結果生成の処理を行ってもよい。 Further, the display control unit 101-5 according to the various embodiments and modifications described above may display, on the report screen of the display screen, object recognition results (object detection results) or segmentation results for partial regions such as the region of interest, artifact region, and abnormal site described above. At this time, for example, a rectangular frame or the like may be superimposed and displayed around the object on the image. Further, for example, a color or the like may be superimposed on the object in the image and displayed. The object recognition results and segmentation results may be generated using a trained model (an object recognition engine, a trained model for object recognition, a segmentation engine, a trained model for segmentation) obtained by learning from learning data in which medical images are labeled (annotated) with information indicating object recognition or segmentation as correct data. Note that the analysis result generation and diagnosis result generation described above may be obtained by using the object recognition result and segmentation result described above. For example, analysis result generation and diagnosis result generation processing may be performed on a region of interest obtained by object recognition or segmentation processing.

また、異常部位を検出する場合には、画像処理装置１０１は、敵対的生成ネットワーク（ＧＡＮ：ＧｅｎｅｒａｔｉｖｅＡｄｖｅｒｓａｒｉａｌＮｅｔｗｏｋｓ）や変分オートエンコーダー（ＶＡＥ：ＶａｒｉａｔｉｏｎａｌＡｕｔｏ－Ｅｎｃｏｄｅｒ）を用いてもよい。例えば、医用画像の生成を学習して得た生成器と、生成器が生成した新たな医用画像と本物の医用画像との識別を学習して得た識別器とからなるＤＣＧＡＮ（ＤｅｅｐＣｏｎｖｏｌｕｔｉｏｎａｌＧＡＮ）を機械学習モデルとして用いることができる。 When detecting an abnormal site, the image processing apparatus 101 may use a generative adversarial network (GAN) or a variational auto-encoder (VAE). For example, a DCGAN (Deep Convolutional GAN), consisting of a generator obtained by learning to generate medical images and a discriminator obtained by learning to discriminate between new medical images generated by the generator and real medical images, can be used as the machine learning model.

ＤＣＧＡＮを用いる場合には、例えば、識別器が入力された医用画像をエンコードすることで潜在変数にし、生成器が潜在変数に基づいて新たな医用画像を生成する。その後、入力された医用画像と生成された新たな医用画像との差分を異常部位として抽出（検出）することができる。また、ＶＡＥを用いる場合には、例えば、入力された医用画像をエンコーダーによりエンコードすることで潜在変数にし、潜在変数をデコーダーによりデコードすることで新たな医用画像を生成する。その後、入力された医用画像と生成された新たな医用画像像との差分を異常部位として抽出することができる。 In the case of using DCGAN, for example, the discriminator encodes an input medical image into a latent variable, and the generator generates a new medical image based on the latent variable. After that, the difference between the input medical image and the generated new medical image can be extracted (detected) as an abnormal site. When VAE is used, for example, an input medical image is encoded by an encoder to generate a latent variable, and a decoder decodes the latent variable to generate a new medical image. After that, the difference between the input medical image and the generated new medical image can be extracted as an abnormal site.

さらに、画像処理装置１０１は、畳み込みオートエンコーダー（ＣＡＥ：ＣｏｎｖｏｌｕｔｉｏｎａｌＡｕｔｏ－Ｅｎｃｏｄｅｒ）を用いて、異常部位を検出してもよい。ＣＡＥを用いる場合には、学習時に入力データ及び出力データとして同じ医用画像を学習させる。これにより、推定時に異常部位がある医用画像をＣＡＥに入力すると、学習の傾向に従って異常部位がない医用画像が出力される。その後、ＣＡＥに入力された医用画像とＣＡＥから出力された医用画像の差分を異常部位として抽出することができる。 Furthermore, the image processing device 101 may detect an abnormal site using a convolutional auto-encoder (CAE). When CAE is used, the same medical image is learned as input data and output data during learning. As a result, when a medical image with an abnormal portion is input to CAE at the time of estimation, a medical image without an abnormal portion is output according to the tendency of learning. After that, the difference between the medical image input to CAE and the medical image output from CAE can be extracted as an abnormal site.
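The difference-based extraction described for DCGAN, VAE, and CAE can be sketched as follows, assuming the reconstructed image has already been produced by a trained model. The function name `anomaly_map`, the threshold value, and the use of a per-pixel absolute difference are assumptions for illustration; the embodiments only state that the difference between the two images is extracted as the abnormal site.

```python
def anomaly_map(input_image, reconstructed_image, threshold=0.2):
    """Hypothetical helper: return the per-pixel absolute difference and a
    binary mask marking pixels whose difference exceeds the threshold."""
    diff = [
        [abs(a - b) for a, b in zip(row_in, row_out)]
        for row_in, row_out in zip(input_image, reconstructed_image)
    ]
    # Pixels the model could not reproduce are candidate abnormal sites.
    mask = [[d > threshold for d in row] for row in diff]
    return diff, mask


# Toy 2 x 2 example: only the top-right pixel differs noticeably.
image = [[0.1, 0.9], [0.2, 0.2]]
reconstruction = [[0.1, 0.3], [0.2, 0.2]]
difference, abnormal_mask = anomaly_map(image, reconstruction)
```

Because the model is trained mainly on images without abnormal sites, regions it cannot reproduce well tend to show a large difference, which is why a simple threshold on the difference can serve as a first approximation.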

これらの場合、画像処理装置１０１は、敵対的生成ネットワーク又はオートエンコーダーを用いて得た医用画像と、該敵対的生成ネットワーク又はオートエンコーダーに入力された医用画像との差に関する情報を異常部位に関する情報として生成することができる。これにより、画像処理装置１０１は、高速に精度よく異常部位を検出することが期待できる。例えば、異常部位の検出精度の向上のために異常部位を含む医用画像を学習データとして数多く集めることが難しい場合であっても、比較的に数多く集め易い正常な被検体の医用画像を学習データとして用いることができる。このため、例えば、異常部位を精度よく検出するための学習を効率的に行うことができる。ここで、オートエンコーダーには、ＶＡＥやＣＡＥ等が含まれる。また、敵対的生成ネットワークの生成部の少なくとも一部がＶＡＥで構成されてもよい。これにより、例えば、同じようなデータを生成してしまう現象を低減しつつ、比較的鮮明な画像を生成することができる。例えば、画像処理装置１０１は、種々の医用画像から敵対的生成ネットワーク又はオートエンコーダーを用いて得た医用画像と、該敵対的生成ネットワーク又は該オートエンコーダーに入力された医用画像との差に関する情報を、異常部位に関する情報として生成することができる。また、例えば、表示制御部１０１－５は、種々の医用画像から敵対的生成ネットワーク又はオートエンコーダーを用いて得た医用画像と、該敵対的生成ネットワーク又は該オートエンコーダーに入力された医用画像との差に関する情報を、異常部位に関する情報として出力部１０３に表示させることができる。 In these cases, the image processing apparatus 101 can generate, as information about the abnormal site, information about the difference between the medical image obtained using the generative adversarial network or the autoencoder and the medical image input to the generative adversarial network or the autoencoder. As a result, the image processing apparatus 101 can be expected to detect an abnormal site quickly and accurately. For example, even if it is difficult to collect a large number of medical images containing abnormal sites as learning data in order to improve the detection accuracy of abnormal sites, medical images of normal subjects, which are relatively easy to collect in large numbers, can be used as learning data. For this reason, for example, learning for accurately detecting an abnormal site can be performed efficiently. Here, autoencoders include VAE, CAE, and the like. Also, at least a part of the generator of the generative adversarial network may be composed of a VAE. As a result, for example, a relatively clear image can be generated while reducing the phenomenon of generating similar data.
For example, the image processing apparatus 101 can generate, as information about the abnormal site, information about the difference between a medical image obtained from various medical images using a generative adversarial network or an autoencoder and the medical image input to the generative adversarial network or the autoencoder. Further, for example, the display control unit 101-5 can cause the output unit 103 to display, as information about the abnormal site, information about the difference between a medical image obtained from various medical images using a generative adversarial network or an autoencoder and the medical image input to the generative adversarial network or the autoencoder.

また、特に診断結果生成用の学習済モデルは、被検者の所定部位の異なる種類の複数の医用画像をセットとする入力データを含む学習データにより学習して得た学習済モデルであってもよい。このとき、学習データに含まれる入力データとして、例えば、眼底のモーションコントラスト正面画像及び輝度正面画像（あるいは輝度断層画像）をセットとする入力データが考えられる。また、学習データに含まれる入力データとして、例えば、眼底の断層画像（Ｂスキャン画像）及びカラー眼底画像（あるいは蛍光眼底画像）をセットとする入力データ等も考えられる。また、異なる種類の複数の医療画像は、異なるモダリティ、異なる光学系、又は異なる原理等により取得されたものであれば何でもよい。 In addition, the trained model for generating diagnostic results in particular may be a trained model obtained by learning using learning data including input data that is a set of a plurality of different types of medical images of a predetermined region of a subject. good. At this time, the input data included in the learning data may be, for example, input data that is a set of a motion contrast front image and a luminance front image (or a luminance tomographic image) of the fundus. As input data included in the learning data, for example, input data such as a set of a fundus tomographic image (B-scan image) and a color fundus image (or a fluorescent fundus image) can be considered. Moreover, the multiple medical images of different types may be acquired by different modalities, different optical systems, or different principles.

また、特に診断結果生成用の学習済モデルは、被検者の異なる部位の複数の医用画像をセットとする入力データを含む学習データにより学習して得た学習済モデルであってもよい。このとき、学習データに含まれる入力データとして、例えば、眼底の断層画像（Ｂスキャン画像）と前眼部の断層画像（Ｂスキャン画像）とをセットとする入力データが考えられる。また、学習データに含まれる入力データとして、例えば、眼底の黄斑の三次元ＯＣＴ画像（三次元断層画像）と眼底の視神経乳頭のサークルスキャン（又はラスタスキャン）断層画像とをセットとする入力データ等も考えられる。 In addition, the trained model for generating diagnosis results in particular may be a trained model obtained by learning using learning data including input data that is a set of a plurality of medical images of different parts of the subject. At this time, as the input data included in the learning data, for example, input data that is a set of a tomographic image (B-scan image) of the fundus and a tomographic image (B-scan image) of the anterior segment can be considered. In addition, as input data included in the learning data, for example, input data that is a set of a three-dimensional OCT image (three-dimensional tomographic image) of the macula of the fundus and a circle scan (or raster scan) tomographic image of the optic disc of the fundus can also be considered.

なお、学習データに含まれる入力データは、被検者の異なる部位及び異なる種類の複数の医用画像であってもよい。このとき、学習データに含まれる入力データは、例えば、前眼部の断層画像とカラー眼底画像とをセットとする入力データ等が考えられる。また、上述した学習済モデルは、被検者の所定部位の異なる撮影画角の複数の医用画像をセットとする入力データを含む学習データにより学習して得た学習済モデルであってもよい。また、学習データに含まれる入力データは、パノラマ画像のように、所定部位を複数領域に時分割して得た複数の医用画像を貼り合わせたものであってもよい。このとき、パノラマ画像のような広画角画像を学習データとして用いることにより、狭画角画像よりも情報量が多い等の理由から画像の特徴量を精度良く取得できる可能性があるため、処理の結果を向上することができる。また、学習データに含まれる入力データは、被検者の所定部位の異なる日時の複数の医用画像をセットとする入力データであってもよい。 The input data included in the learning data may be a plurality of medical images of different parts and of different types of the subject. At this time, the input data included in the learning data may be, for example, input data that is a set of a tomographic image of the anterior segment and a color fundus image. Further, the above-described trained model may be a trained model obtained by learning using learning data including input data that is a set of a plurality of medical images of a predetermined part of the subject with different imaging angles of view. The input data included in the learning data may be obtained by stitching together a plurality of medical images obtained by time-dividing a predetermined part into a plurality of regions, as in a panoramic image. At this time, by using a wide-angle image such as a panoramic image as learning data, the feature amount of the image may be acquired with high accuracy because, for example, the amount of information is larger than that of a narrow-angle image, and the result of the processing can therefore be improved. Also, the input data included in the learning data may be input data that is a set of a plurality of medical images of a predetermined part of the subject taken on different dates and times.

また、上述した解析結果と診断結果と物体認識結果とセグメンテーション結果とのうち少なくとも１つの結果が表示される表示画面は、レポート画面に限らない。このような表示画面は、例えば、撮影確認画面、経過観察用の表示画面、及び撮影前の各種調整用のプレビュー画面（各種のライブ動画像が表示される表示画面）等の少なくとも１つの表示画面に表示されてもよい。例えば、上述した学習済モデルを用いて得た上記少なくとも１つの結果を撮影確認画面に表示させることにより、操作者は、撮影直後であっても精度の良い結果を確認することができる。 Further, the display screen on which at least one of the above-described analysis result, diagnosis result, object recognition result, and segmentation result is displayed is not limited to the report screen. The result may be displayed on at least one display screen among, for example, an imaging confirmation screen, a display screen for follow-up observation, and a preview screen for various adjustments before imaging (a display screen on which various live moving images are displayed). For example, by displaying the at least one result obtained using the above-described trained model on the imaging confirmation screen, the operator can confirm a highly accurate result even immediately after imaging.

また、例えば、特定の物体が認識されると、認識された物体を囲う枠がライブ動画像に重畳表示させるように構成されてもよい。このとき、物体認識結果の確からしさを示す情報（例えば、割合を示す数値）が閾値を超えた場合には、例えば、物体を囲う枠の色が変更される等のように強調表示されてもよい。これにより、検者は、物体をライブ動画上で容易に識別することができる。 Further, for example, when a specific object is recognized, a frame surrounding the recognized object may be superimposed on the live moving image. At this time, if the information indicating the likelihood of the object recognition result (for example, a numerical value indicating a ratio) exceeds a threshold, the object may be highlighted, for example, by changing the color of the frame surrounding the object. This allows the examiner to easily identify the object on the live moving image.

なお、上述した様々な学習済モデルの学習に用いられる正解データの生成には、ラベル付け（アノテーション）等の正解データを生成するための正解データ生成用の学習済モデルが用いられてもよい。このとき、正解データ生成用の学習済モデルは、検者がラベル付け（アノテーション）して得た正解データを（順次）追加学習することにより得られたものであってもよい。すなわち、正解データ生成用の学習済モデルは、ラベル付け前のデータを入力データとし、ラベル付け後のデータを出力データとする学習データを追加学習することにより得られたものであってもよい。また、動画像等のような連続する複数フレームにおいて、前後のフレームの物体認識やセグメンテーション等の結果を考慮して、結果の精度が低いと判定されたフレームの結果を修正するように構成されてもよい。このとき、検者からの指示に応じて、修正後の結果を正解データとして追加学習するように構成されてもよい。また、例えば、結果の精度が低い医用画像については、検者が該医用画像上に、学習済モデルが抽出した特徴量を可視化したマップ（注意マップ、活性化マップ）の一例である、特徴量をカラーで示したカラーマップ（ヒートマップ）を確認しながらラベル付け（アノテーション）した画像を入力データとして追加学習するように構成されてもよい。例えば、学習済モデルにおける結果を出力する直前等のレイヤー上のヒートマップにおいて、注目すべき箇所が検者の意図と異なる場合には、検者が注目すべきと考える箇所にラベル付け（アノテーション）した医用画像を追加学習してもよい。これにより、例えば、学習済モデルは、医用画像上の部分領域であって、学習済モデルの出力結果に対して比較的影響が大きな部分領域の特徴量を、他の領域よりも優先して（重みを付けて）追加学習することができる。 In addition, a trained model for correct data generation, which generates correct data such as labels (annotations), may be used to generate the correct data used for learning of the various trained models described above. At this time, the trained model for correct data generation may be obtained by (sequentially) additionally learning the correct data obtained through labeling (annotation) by the examiner. In other words, the trained model for correct data generation may be obtained by additionally learning learning data in which data before labeling is used as input data and data after labeling is used as output data. In addition, for a plurality of consecutive frames such as a moving image, the apparatus may be configured to correct the result of a frame determined to have low accuracy, taking into account the results of object recognition, segmentation, and the like for the preceding and succeeding frames. At this time, the corrected result may be additionally learned as correct data in accordance with an instruction from the examiner.
In addition, for example, for a medical image for which the accuracy of the result is low, the apparatus may be configured so that an image labeled (annotated) by the examiner is additionally learned as input data, the examiner performing the labeling while checking, on the medical image, a color map (heat map) showing the feature amount in color, which is an example of a map (attention map, activation map) that visualizes the feature amount extracted by the trained model. For example, if the location highlighted in the heat map of a layer such as the one immediately before the trained model outputs its result differs from what the examiner intended, a medical image in which the location that the examiner considers noteworthy is labeled (annotated) may be additionally learned. As a result, for example, the trained model can additionally learn the feature amount of a partial region on the medical image that has a relatively large effect on the output result of the trained model, with priority over (with greater weight than) other regions.

ここで、上述した様々な学習済モデルは、学習データを用いた機械学習により得ることができる。機械学習には、例えば、多階層のニューラルネットワークから成る深層学習（ＤｅｅｐＬｅａｒｎｉｎｇ）がある。また、多階層のニューラルネットワークの少なくとも一部には、例えば、畳み込みニューラルネットワークを用いることができる。また、多階層のニューラルネットワークの少なくとも一部には、オートエンコーダー（自己符号化器）に関する技術が用いられてもよい。また、学習には、バックプロパゲーション（誤差逆伝搬法）に関する技術が用いられてもよい。また、学習には、各ユニット（各ニューロン、あるいは各ノード）をランダムに不活性化する手法（ドロップアウト）が用いられてもよい。また、学習には、多階層のニューラルネットワークの各層に伝わったデータを、活性化関数（例えばＲｅＬｕ関数）が適用される前に、正規化する手法（バッチ正規化）が用いられてもよい。ただし、機械学習としては、深層学習に限らず、画像等の学習データの特徴量を学習によって自ら抽出（表現）可能なモデルを用いた学習であれば何でもよい。ここで、機械学習モデルとは、ディープラーニング等の機械学習アルゴリズムによる学習モデルをいう。また、学習済モデルとは、任意の機械学習アルゴリズムによる機械学習モデルに対して、事前に適切な学習データを用いてトレーニングした（学習を行った）モデルである。ただし、学習済モデルは、それ以上の学習を行わないものではなく、追加の学習を行うこともできるものとする。また、学習データとは、入力データ及び出力データ（正解データ）のペアで構成される。ここで、学習データを教師データという場合もあるし、あるいは、正解データを教師データという場合もある。 Here, the various trained models described above can be obtained by machine learning using learning data. Machine learning includes, for example, deep learning consisting of multilevel neural networks. Also, for example, a convolutional neural network can be used for at least part of the multi-layered neural network. Also, at least a part of the multi-layered neural network may employ a technology related to an autoencoder. Also, a technique related to back propagation (error backpropagation method) may be used for learning. Also, for learning, a method (dropout) of randomly inactivating each unit (each neuron or each node) may be used. Also, for learning, a method (batch normalization) of normalizing data transmitted to each layer of a multi-layer neural network before an activation function (for example, ReLu function) is applied may be used. However, machine learning is not limited to deep learning, and any learning using a model capable of extracting (expressing) feature amounts of learning data such as images by learning can be used. Here, the machine learning model refers to a learning model based on a machine learning algorithm such as deep learning. 
Also, a trained model is a machine learning model based on an arbitrary machine learning algorithm that has been trained (has undergone learning) in advance using appropriate learning data. However, a trained model is not precluded from further learning; additional learning can also be performed. Learning data consists of pairs of input data and output data (correct data). Here, learning data may be referred to as teacher data, or correct data may be referred to as teacher data.

なお、ＧＰＵは、データをより多く並列処理することで効率的な演算を行うことができる。このため、ディープラーニングのような学習モデルを用いて複数回に渡り学習を行う場合には、ＧＰＵで処理を行うことが有効である。そこで、本変形例では、学習部（不図示）の一例である画像処理装置１０１による処理には、ＣＰＵに加えてＧＰＵを用いる。具体的には、学習モデルを含む学習プログラムを実行する場合に、ＣＰＵとＧＰＵが協働して演算を行うことで学習を行う。なお、学習部の処理は、ＣＰＵ又はＧＰＵのみにより演算が行われてもよい。また、上述した様々な学習済モデルを用いた処理を実行する処理部（推定部）も、学習部と同様にＧＰＵを用いてもよい。また、学習部は、不図示の誤差検出部と更新部とを備えてもよい。誤差検出部は、入力層に入力される入力データに応じてニューラルネットワークの出力層から出力される出力データと、正解データとの誤差を得る。誤差検出部は、損失関数を用いて、ニューラルネットワークからの出力データと正解データとの誤差を計算するようにしてもよい。また、更新部は、誤差検出部で得られた誤差に基づいて、その誤差が小さくなるように、ニューラルネットワークのノード間の結合重み付け係数等を更新する。この更新部は、例えば、誤差逆伝播法を用いて、結合重み付け係数等を更新する。誤差逆伝播法は、上記の誤差が小さくなるように、各ニューラルネットワークのノード間の結合重み付け係数等を調整する手法である。 Note that the GPU can perform efficient calculations by processing more data in parallel. Therefore, when learning is performed multiple times using a learning model such as deep learning, it is effective to perform processing using a GPU. Therefore, in this modification, the GPU is used in addition to the CPU for processing by the image processing apparatus 101, which is an example of a learning unit (not shown). Specifically, when a learning program including a learning model is executed, the CPU and the GPU cooperate to perform calculations for learning. Note that the processing of the learning unit may be performed by only the CPU or GPU. Also, a processing unit (estimating unit) that executes processing using various learned models described above may also use a GPU, like the learning unit. Also, the learning unit may include an error detection unit and an updating unit (not shown). The error detection unit obtains an error between correct data and output data output from the output layer of the neural network according to input data input to the input layer. The error detector may use a loss function to calculate the error between the output data from the neural network and the correct data. 
Also, the updating unit updates the weighting coefficients for coupling between nodes of the neural network based on the error obtained by the error detecting unit so as to reduce the error. This updating unit updates the connection weighting coefficients and the like using, for example, the error backpropagation method. The error backpropagation method is a method of adjusting the connection weighting coefficients and the like between nodes of each neural network so as to reduce the above error.
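The roles of the error detection unit and the updating unit can be illustrated with the following minimal sketch for a single linear unit. The function name `train_step`, the squared-error loss, and the learning rate are assumptions made for this example; the embodiments do not specify a particular loss function or network structure.

```python
def train_step(w, b, x, target, lr=0.1):
    """Hypothetical single update for a one-weight model y = w * x + b."""
    y = w * x + b                # forward pass: output for the input data
    error = y - target           # error detection: output vs. correct data
    loss = 0.5 * error ** 2      # loss function (squared error, assumed)
    grad_w = error * x           # backpropagated gradient for the weight
    grad_b = error               # backpropagated gradient for the bias
    # Update in the direction that reduces the error.
    return w - lr * grad_w, b - lr * grad_b, loss


w, b = 0.0, 0.0
losses = []
for _ in range(20):
    w, b, loss = train_step(w, b, x=2.0, target=4.0)
    losses.append(loss)
```

In this toy setting the loss shrinks at every step, which mirrors the description that the connection weighting coefficients are adjusted so that the error becomes smaller. In a multi-layer network the same gradients would be propagated backward layer by layer.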

また、上述した物体認識や、セグメンテーション、高画質化等に用いられる機械学習モデルとしては、複数のダウンサンプリング層を含む複数の階層からなるエンコーダーの機能と、複数のアップサンプリング層を含む複数の階層からなるデコーダーの機能とを有するＵ－ｎｅｔ型の機械学習モデルが適用可能である。Ｕ－ｎｅｔ型の機械学習モデルでは、エンコーダーとして構成される複数の階層において曖昧にされた位置情報（空間情報）を、デコーダーとして構成される複数の階層において、同次元の階層（互いに対応する階層）で用いることができるように（例えば、スキップコネクションを用いて）構成される。 In addition, as the machine learning model used for the object recognition, segmentation, image quality enhancement, and the like described above, a U-net type machine learning model is applicable, which has the function of an encoder consisting of multiple layers including multiple downsampling layers and the function of a decoder consisting of multiple layers including multiple upsampling layers. In the U-net type machine learning model, position information (spatial information) that is made ambiguous in the layers configured as the encoder is made available (for example, using skip connections) to the layers of the same dimension (mutually corresponding layers) among the layers configured as the decoder.

また、上述した物体認識や、セグメンテーション、高画質化等に用いられる機械学習モデルとしては、例えば、ＦＣＮ（ＦｕｌｌｙＣｏｎｖｏｌｕｔｉｏｎａｌＮｅｔｗｏｒｋ）、又はＳｅｇＮｅｔ等を用いることもできる。また、所望の構成に応じて領域単位で物体認識を行う機械学習モデルを用いてもよい。物体認識を行う機械学習モデルとしては、例えば、ＲＣＮＮ（ＲｅｇｉｏｎＣＮＮ）、ｆａｓｔＲＣＮＮ、又はｆａｓｔｅｒＲＣＮＮを用いることができる。さらに、領域単位で物体認識を行う機械学習モデルとして、ＹＯＬＯ（ＹｏｕＯｎｌｙＬｏｏｋＯｎｃｅ）、又はＳＳＤ（ＳｉｎｇｌｅＳｈｏｔＤｅｔｅｃｔｏｒ、あるいはＳｉｎｇｌｅＳｈｏｔＭｕｌｔｉＢｏｘＤｅｔｅｃｔｏｒ）を用いることもできる。 As a machine learning model used for object recognition, segmentation, image quality improvement, and the like, for example, FCN (Fully Convolutional Network), SegNet, or the like can be used. Also, a machine learning model that performs object recognition on a region-by-region basis according to a desired configuration may be used. As a machine learning model for object recognition, RCNN (Region CNN), fastRCNN, or fasterRCNN, for example, can be used. Furthermore, YOLO (You Only Look Once) or SSD (Single Shot Detector or Single Shot MultiBox Detector) can also be used as a machine learning model for recognizing objects in units of regions.

また、機械学習モデルは、例えば、カプセルネットワーク（ＣａｐｓｕｌｅＮｅｔｗｏｒｋ；ＣａｐｓＮｅｔ）でもよい。ここで、一般的なニューラルネットワークでは、各ユニット（各ニューロン、あるいは各ノード）はスカラー値を出力するように構成されることによって、例えば、画像における特徴間の空間的な位置関係（相対位置）に関する空間情報が低減されるように構成されている。これにより、例えば、画像の局所的な歪みや平行移動等の影響が低減されるような学習を行うことができる。一方、カプセルネットワークでは、各ユニット（各カプセル）は空間情報をベクトルとして出力するように構成されることよって、例えば、空間情報が保持されるように構成されている。これにより、例えば、画像における特徴間の空間的な位置関係が考慮されたような学習を行うことができる。 Also, the machine learning model may be, for example, a capsule network (CapsNet). Here, in a general neural network, each unit (each neuron or each node) is configured to output a scalar value, so that, for example, spatial information about the spatial positional relationship (relative position) between features in an image is reduced. As a result, for example, learning can be performed in which the effects of local distortion, translation, and the like of an image are reduced. On the other hand, in a capsule network, each unit (each capsule) is configured to output spatial information as a vector, so that, for example, the spatial information is retained. As a result, for example, learning can be performed that takes into account the spatial positional relationship between features in the image.

［変形例６］
上述した様々な実施形態及び変形例におけるプレビュー画面において、ライブ動画像の少なくとも１つのフレーム毎に上述した種々の学習済モデルが用いられるように構成されてもよい。このとき、プレビュー画面において、異なる部位や異なる種類の複数のライブ動画像が表示されている場合には、各ライブ動画像に対応する学習済モデルが用いられるように構成されてもよい。これにより、例えば、ライブ動画像であっても、処理時間を短縮することができるため、検者は撮影開始前に精度の高い情報を得ることができる。このため、例えば、再撮影の失敗等を低減することができるため、診断の精度や効率を向上させることができる。 [Modification 6]
The preview screens in the various embodiments and modifications described above may be configured so that the various learned models described above are used for at least one frame of the live moving image. At this time, when a plurality of live moving images of different parts or different types are displayed on the preview screen, the learned model corresponding to each live moving image may be used. As a result, the processing time can be shortened even for a live moving image, for example, so that the examiner can obtain highly accurate information before the start of imaging. For this reason, for example, failures in re-imaging can be reduced, so that accuracy and efficiency of diagnosis can be improved.

なお、複数のライブ動画像は、例えば、ＸＹＺ方向のアライメントのための前眼部の動画像、及び眼底観察光学系のフォーカス調整やＯＣＴフォーカス調整のための眼底の正面動画像であってよい。また、複数のライブ動画像は、例えば、ＯＣＴのコヒーレンスゲート調整（測定光路長と参照光路長との光路長差の調整）のための眼底の断層動画像等であってもよい。このようなプレビュー画像が表示される場合、上述した物体認識用の学習済モデルやセグメンテーション用の学習済モデルを用いて検出された領域が所定の条件を満たすように、上述した各種調整が行われるように画像処理装置１０１を構成してもよい。例えば、物体認識用の学習済モデルやセグメンテーション用の学習済モデルを用いて検出された硝子体領域やＲＰＥ等の所定の網膜層等に関する値（例えば、コントラスト値あるいは強度値）が閾値を超える（あるいはピーク値になる）ように、ＯＣＴフォーカス調整等の各種調整が行われるように構成されてもよい。また、例えば、物体認識用の学習済モデルやセグメンテーション用の学習済モデルを用いて検出された硝子体領域やＲＰＥ等の所定の網膜層が深さ方向における所定の位置になるように、ＯＣＴのコヒーレンスゲート調整が行われるように構成されてもよい。 Note that the plurality of live moving images may be, for example, a moving image of the anterior segment for alignment in the XYZ directions, and a front moving image of the fundus for focus adjustment of the fundus observation optical system and OCT focus adjustment. Also, the plurality of live moving images may be, for example, tomographic moving images of the fundus for coherence gate adjustment of OCT (adjustment of the optical path length difference between the measurement optical path length and the reference optical path length). When such a preview image is displayed, the image processing apparatus 101 may be configured so that the various adjustments described above are performed such that the region detected using the above-described trained model for object recognition or trained model for segmentation satisfies a predetermined condition. For example, various adjustments such as OCT focus adjustment may be performed so that a value (for example, a contrast value or an intensity value) related to the vitreous region or a predetermined retinal layer such as the RPE, detected using a trained model for object recognition or a trained model for segmentation, exceeds a threshold (or reaches a peak value).
Further, for example, the OCT is performed so that a predetermined retinal layer such as the vitreous region and the RPE detected using a trained model for object recognition or a trained model for segmentation is positioned at a predetermined position in the depth direction. Coherence gate adjustments may be configured to occur.
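The threshold-or-peak condition described above can be illustrated with a minimal sketch. It assumes a hypothetical callback `region_value(pos)` that moves the focus to `pos`, runs segmentation on the preview frame, and returns the contrast or intensity measured inside the detected region; neither the callback nor the function name comes from the source.

```python
from typing import Callable, Iterable, Optional, Tuple


def adjust_focus(
    positions: Iterable[float],
    region_value: Callable[[float], float],
    threshold: Optional[float] = None,
) -> Tuple[Optional[float], float]:
    """Sweep candidate focus positions and return (position, value).

    region_value is a hypothetical callback (an assumption of this sketch):
    it returns the contrast or intensity of the region detected at pos.
    """
    best_pos, best_val = None, float("-inf")
    for pos in positions:
        val = region_value(pos)
        if threshold is not None and val > threshold:
            return pos, val  # threshold condition met: stop the sweep early
        if val > best_val:
            best_pos, best_val = pos, val
    # No threshold given, or it was never exceeded: fall back to the peak.
    return best_pos, best_val
```

With a threshold the sweep stops at the first position that satisfies it; without one, the position giving the peak value is returned. The same pattern could serve for coherence gate adjustment by replacing the measured value with the distance of the detected layer from its target depth.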

In these cases, the image processing apparatus 101 can use a trained model to perform image quality enhancement on the moving image and generate a high-quality moving image. While the high-quality moving image is displayed, the imaging control unit 101-2 can drive and control an optical member that changes the imaging range, such as the reference mirror 221, so that a partial region such as a region of interest obtained by segmentation processing or the like is located at a predetermined position in the display area. In such a case, the imaging control unit 101-2 can automatically perform alignment processing, based on highly accurate information, so that the desired region is located at the predetermined position in the display area. The optical member that changes the imaging range may be, for example, an optical member that adjusts the coherence gate position, specifically the reference mirror 221 that reflects the reference light. The coherence gate position can also be adjusted by an optical member that changes the optical path length difference between the measurement optical path length and the reference optical path length; this optical member may be, for example, a mirror (not shown) for changing the optical path length of the measurement light. The optical member that changes the imaging range may also be, for example, the stage unit 100-2. Further, in response to an instruction to start imaging, the imaging control unit 101-2 may drive and control the scanning means so that a partial region such as an artifact region obtained by segmentation processing or the like is imaged again (rescanned) during imaging or at the end of imaging. Further, for example, the apparatus may be configured so that various adjustments, the start of imaging, and the like are performed automatically when information indicating the certainty of an object recognition result for the region of interest (for example, a numerical value indicating a ratio) exceeds a threshold. Alternatively, for example, the apparatus may be configured so that, when that information exceeds a threshold, each adjustment, the start of imaging, and the like are changed to a state in which they can be executed in response to an instruction from the examiner (that is, the execution-prohibited state is released).

Moving images to which the various trained models described above can be applied are not limited to live moving images and may be, for example, moving images stored (saved) in the storage unit 101-3. In this case, for example, a moving image obtained by aligning, for each of at least one frame, the frames of a tomographic moving image of the fundus stored (saved) in the storage unit 101-3 may be displayed on the display screen. For example, when the vitreous body is to be observed well, a reference frame may first be selected based on a condition such as the vitreous body being present in the frame as much as possible. Here, each frame is a tomographic image (B-scan image) in the XZ directions. A moving image in which the other frames are aligned in the XZ directions with respect to the selected reference frame may then be displayed on the display screen. At this time, for example, high-quality images (high-quality frames) sequentially generated by the trained model for image quality enhancement, for each of at least one frame of the moving image, may be displayed continuously.

As the inter-frame alignment method described above, the same method may be applied to alignment in the X direction and alignment in the Z direction (depth direction), or entirely different methods may be applied. Alignment in the same direction may also be performed multiple times by different methods; for example, coarse alignment may be followed by fine alignment. Alignment methods include, for example: (coarse, Z-direction) alignment using retinal layer boundaries obtained by segmenting a tomographic image (B-scan image); (fine, X- and Z-direction) alignment using correlation information (similarity) between a reference image and a plurality of regions obtained by dividing a tomographic image; (X-direction) alignment using a one-dimensional projection image generated for each tomographic image (B-scan image); and (X-direction) alignment using a two-dimensional frontal image. The apparatus may also be configured so that coarse alignment is performed in pixel units and fine alignment is then performed in sub-pixel units.
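As one illustration of the coarse-then-fine idea, the sketch below aligns two B-scans in the X direction using one-dimensional projections: an integer-pixel search first, followed by a sub-pixel refinement. The mean-squared-difference cost and the parabolic refinement are choices made for this sketch, not methods stated in the source, and all function names are hypothetical. Both profiles are assumed to have the same length, with `max_shift` smaller than that length.

```python
from typing import List, Sequence, Tuple


def column_profile(bscan: Sequence[Sequence[float]]) -> List[float]:
    """Sum each column (A-scan) over depth to get a 1-D profile along X."""
    return [sum(column) for column in zip(*bscan)]


def mismatch(ref: Sequence[float], mov: Sequence[float], shift: int) -> float:
    """Mean squared difference between mov[i] and ref[i - shift] on the overlap."""
    pairs = [
        (mov[i], ref[i - shift])
        for i in range(len(mov))
        if 0 <= i - shift < len(ref)
    ]
    return sum((m - r) ** 2 for m, r in pairs) / len(pairs)


def estimate_x_shift(
    ref: Sequence[float], mov: Sequence[float], max_shift: int
) -> Tuple[int, float]:
    """Return (coarse pixel shift, refined sub-pixel shift) of mov relative to ref."""
    # Coarse step: exhaustive search over integer shifts.
    costs = {s: mismatch(ref, mov, s) for s in range(-max_shift, max_shift + 1)}
    coarse = min(costs, key=costs.get)
    fine = float(coarse)
    # Fine step: fit a parabola through the cost at the best shift and its
    # two neighbours, and take the vertex as the sub-pixel estimate.
    if -max_shift < coarse < max_shift:
        left, mid, right = costs[coarse - 1], costs[coarse], costs[coarse + 1]
        curvature = left - 2.0 * mid + right
        if curvature > 0:
            fine += 0.5 * (left - right) / curvature
    return coarse, fine
```

The same two-stage search could be applied along Z, or replaced by boundary-based alignment for the coarse stage, as the paragraph above allows.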

Here, during the various adjustments, the imaging target, such as the retina of the eye to be examined, may not yet be imaged well. In that case, the medical image input to the trained model differs greatly from the medical images used as training data, so a high-quality image may not be obtained accurately. The apparatus may therefore be configured to start displaying the high-quality moving image (continuous display of high-quality frames) automatically once an evaluation value, such as an image quality evaluation of the tomographic image (B-scan), exceeds a threshold. It may also be configured to change the image quality enhancement button to a state in which the examiner can select it (active state) once such an evaluation value exceeds a threshold.
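The two gating behaviours just described can be sketched as a small decision function. The function name, the `auto_start` flag, and the dictionary keys are assumptions made for this illustration only.

```python
def preview_controls(quality_score: float, threshold: float, auto_start: bool) -> dict:
    """Decide what the preview UI may do for the current B-scan quality score.

    auto_start=True  : start showing enhanced frames once the score exceeds
                       the threshold.
    auto_start=False : only make the enhancement button selectable (active).
    """
    ready = quality_score > threshold
    return {
        "show_enhanced": ready and auto_start,
        "button_active": ready and not auto_start,
    }
```

Until the score exceeds the threshold, neither action is permitted, which keeps poorly acquired frames from being fed to the enhancement model.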

Further, for example, a different trained model for image quality enhancement may be prepared for each imaging mode, the imaging modes differing in scan pattern or the like, and the trained model for image quality enhancement corresponding to the selected imaging mode may be selected. Alternatively, a single trained model for image quality enhancement, obtained by training on training data that includes various medical images obtained in different imaging modes, may be used.

[Modification 7]
In the embodiments and modifications described above, while any of the various trained models is undergoing additional training, it may be difficult to produce output (inference or prediction) using that model itself. It is therefore preferable to prohibit input of medical images other than training data to a trained model that is undergoing additional training. A second trained model identical to the trained model before additional training may also be prepared as a backup trained model. In that case, it is preferable that medical images other than training data can be input to the backup trained model while additional training is in progress. After the additional training is complete, the additionally trained model is evaluated and, if there is no problem, the backup trained model is replaced with the additionally trained model. If there is a problem, the backup trained model may continue to be used.
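The backup-and-replace flow of this modification can be sketched as follows. The class names, the callback signatures, and the toy `Scale` model are all hypothetical; the sketch assumes a model is a callable object that can be deep-copied.

```python
import copy


class Scale:
    """Toy stand-in for a trained model: multiplies its input by a gain."""

    def __init__(self, gain: float):
        self.gain = gain

    def __call__(self, x: float) -> float:
        return self.gain * x


class ModelSlot:
    """Serves inference from a stable model while a copy is trained further."""

    def __init__(self, model):
        self.serving = model  # this model answers all inference requests

    def predict(self, x):
        return self.serving(x)

    def retrain(self, train_fn, evaluate_fn, min_score: float) -> bool:
        # Train a copy, so the serving model never sees a half-updated state.
        candidate = copy.deepcopy(self.serving)
        train_fn(candidate)
        # Promote the candidate only if the evaluation passes.
        if evaluate_fn(candidate) >= min_score:
            self.serving = candidate
            return True
        return False  # keep using the previous (backup) model
```

A failed evaluation leaves the previously served model in place, which corresponds to continuing to use the backup trained model.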

As the evaluation of the trained model after additional training, for example, a trained model for classification may be used that distinguishes high-quality images obtained with the trained model for image quality enhancement from other types of images. The trained model for classification may be, for example, a trained model obtained by training on training data whose input data are a plurality of images, including high-quality images obtained with the trained model for image quality enhancement and low-quality images, and whose ground-truth data are data in which the types of these images are labeled (annotated). At inference (prediction) time, the image type of the input data may be displayed together with information indicating the certainty for each image type included in the ground-truth data used in training (for example, a numerical value indicating a ratio). In addition to the above images, the input data for the trained model for classification may include high-quality images in which contrast has been increased, noise reduced, and so on by superimposition processing of a plurality of low-quality images (for example, averaging of a plurality of low-quality images obtained after alignment). As another evaluation of the trained model after additional training, for example, a plurality of high-quality images obtained from the same image using the trained model after additional training and the trained model before additional training (the backup trained model) may be compared, or analysis results of those high-quality images may be compared. At this time, for example, it may be determined whether the result of comparing the high-quality images (an example of a change due to additional training), or the result of comparing their analysis results (an example of a change due to additional training), is within a predetermined range, and the determination result may be displayed.
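The before/after comparison can be sketched with a simple pixel-wise measure. The mean absolute difference is a metric chosen for this example only; the source does not specify how the comparison result is computed, and the function name is hypothetical.

```python
from typing import Sequence, Tuple


def change_within_range(
    before: Sequence[Sequence[float]],
    after: Sequence[Sequence[float]],
    max_mean_abs_diff: float,
) -> Tuple[bool, float]:
    """Compare two enhanced images of the same input, pixel by pixel.

    before / after: outputs of the model before and after additional training.
    Returns (is the change within the allowed range, mean absolute difference).
    """
    diffs = [
        abs(a - b)
        for row_b, row_a in zip(before, after)
        for b, a in zip(row_b, row_a)
    ]
    mean_abs_diff = sum(diffs) / len(diffs)
    return mean_abs_diff <= max_mean_abs_diff, mean_abs_diff
```

The boolean result is what would be shown to the examiner as the determination result; the same check could be applied to analysis values (for example, layer thicknesses) instead of pixel values.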

Trained models obtained by training for each imaging site may also be made selectively available. Specifically, a plurality of trained models can be prepared, including a first trained model obtained using training data that includes a first imaging site (for example, the anterior segment or the posterior segment) and a second trained model obtained using training data that includes a second imaging site different from the first. The image processing apparatus 101 may have selection means for selecting one of these trained models, and may have control means for executing additional training on the selected trained model. In response to an instruction from the examiner, the control means can search for data in which the imaging site corresponding to the selected trained model is paired with a captured image of that imaging site, and can execute training that uses the retrieved data as training data, as additional training for the selected trained model. The imaging site corresponding to the selected trained model may be obtained from header information of the data or entered manually by the examiner. The data search may be performed over a network, for example on a server of an external facility such as a hospital or research institute. This allows additional training to be performed efficiently for each imaging site, using captured images of the imaging site corresponding to the trained model.

The selection means and the control means may be configured as software modules executed by a processor such as a CPU or MPU of the image processing apparatus 101. They may also be configured as a circuit that performs a specific function, such as an ASIC, or as an independent device.

When training data for additional training is obtained over a network from a server of an external facility such as a hospital or research institute, it is useful to reduce the loss of reliability caused by tampering, system trouble during additional training, and the like. The validity of the training data for additional training may therefore be checked by confirming consistency using a digital signature or hashing. This protects the training data for additional training. If, as a result of the consistency check by digital signature or hashing, the validity of the training data cannot be confirmed, a warning to that effect is issued and additional training using that training data is not performed. The server may take any form, such as a cloud server, fog server, or edge server, regardless of where it is installed. When a network within a facility, within a site that includes the facility, within an area that includes a plurality of facilities, or the like is configured for wireless communication, the reliability of the network may be improved by, for example, using radio waves in a dedicated wavelength band allocated exclusively to that facility, site, or area. The network may also be configured using wireless communication capable of high speed, large capacity, low latency, and many simultaneous connections.
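A minimal sketch of the hash-based consistency check follows, using SHA-256 from Python's standard library. It assumes the expected digest is obtained through a separate trusted channel; the function names are hypothetical, and a digital signature scheme would additionally require key management that is not shown here.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the payload."""
    return hashlib.sha256(data).hexdigest()


def accept_training_data(payload: bytes, expected_sha256: str) -> bool:
    """Return True only if the payload matches the digest published by the sender.

    On a mismatch a warning is printed and the caller is expected to skip
    additional training with this payload.
    """
    # compare_digest avoids leaking where the two strings first differ.
    ok = hmac.compare_digest(sha256_hex(payload), expected_sha256.lower())
    if not ok:
        print("warning: training data failed the integrity check; skipping it")
    return ok
```

A caller would run the additional training only when `accept_training_data` returns `True`, which matches the behaviour of issuing a warning and not training on unverified data.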

Data protection by the consistency check described above is applicable not only to training data for additional training but also to any data that includes medical images. An image management system may be configured so that transactions of data including medical images between servers at a plurality of facilities are managed by a distributed network. The image management system may also be configured to link, in chronological order, a plurality of blocks in each of which a transaction history and the hash value of the previous block are recorded together. As the technique for the consistency check and the like, cryptography that is difficult to break even with a quantum computer, such as a quantum-gate computer, may be used (for example, lattice-based cryptography or quantum cryptography based on quantum key distribution). Here, the image management system may be a device or system that receives and stores images captured by an imaging apparatus and images that have undergone image processing. The image management system can also transmit images in response to requests from connected devices, perform image processing on stored images, and request other devices to perform image processing. The image management system can include, for example, a picture archiving and communication system (PACS). The image management system also includes a database that can store, together with each received image, various associated information such as subject information and imaging time. The image management system is connected to a network and, in response to requests from other devices, can send and receive images, convert images, and send and receive various information associated with stored images.
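The chain of blocks described above, each recording a transaction history together with the hash value of the previous block, can be sketched as below. The block layout, the JSON serialization, and the function names are assumptions for illustration; a real distributed ledger would also need consensus among the participating servers, which is omitted.

```python
import hashlib
import json
from typing import List


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: List[dict], transactions: List[str]) -> None:
    """Append a block that records the hash of the block before it."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append(
        {"index": len(chain), "transactions": transactions, "prev_hash": prev_hash}
    )


def chain_is_valid(chain: List[dict]) -> bool:
    """Check that every block still matches the hash stored in its successor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )
```

Altering the transaction history of any block that has a successor changes its hash, so the stored `prev_hash` of the next block no longer matches and the tampering is detected.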

When additional training is performed on the various trained models, a GPU can be used for high-speed processing. Because a GPU can compute efficiently by processing large amounts of data in parallel, it is effective to use a GPU when training is repeated many times with a learning model such as deep learning. The additional training may also be performed by a GPU and a CPU or the like working together.

[Modification 8]
In the various embodiments and modifications described above, an instruction from the examiner may be given by voice or the like as well as manually (for example, through a user interface). In that case, for example, a machine learning model including a speech recognition model (speech recognition engine, trained model for speech recognition) obtained by machine learning may be used. A manual instruction may be given by character input using a keyboard, touch panel, or the like. In that case, for example, a machine learning model including a character recognition model (character recognition engine, trained model for character recognition) obtained by machine learning may be used. An instruction from the examiner may also be given by gesture or the like. In that case, a machine learning model including a gesture recognition model (gesture recognition engine, trained model for gesture recognition) obtained by machine learning may be used.

An instruction from the examiner may also be a result of detecting the examiner's line of sight on the display screen of the output unit 103, or the like. The line-of-sight detection result may be, for example, a pupil detection result obtained from a moving image of the examiner captured from around the display screen of the output unit 103. Pupil detection from the moving image may use an object recognition engine as described above. An instruction from the examiner may also be given by brain waves, weak electrical signals flowing through the body, or the like.

In such cases, the training data may be, for example, training data whose input data are character data or voice data (waveform data) indicating an instruction to display results of processing by the various trained models described above, and whose ground-truth data are execution commands for actually displaying those results on the output unit 103. The training data may also be, for example, training data whose ground-truth data are an execution command for whether to set imaging parameters automatically and an execution command for changing the button for that command to the active state. Further, the training data may be, for example, training data whose ground-truth data are an execution command for whether to perform image quality enhancement processing to obtain a high-quality image and an execution command for changing the button for that command to the active state. Any training data may be used as long as the instruction content indicated by the character data, voice data, or the like and the execution command content correspond to each other. Voice data may be converted into character data using an acoustic model, a language model, or the like. Waveform data obtained with a plurality of microphones may be used to reduce noise data superimposed on the voice data. The apparatus may be configured so that instruction by characters, voice, or the like and instruction by mouse, touch panel, or the like can be selected in response to an instruction from the examiner. It may also be configured so that instruction by characters, voice, or the like can be switched on and off in response to an instruction from the examiner.

Here, machine learning includes deep learning as described above, and an RNN, for example, can be used for at least part of a multi-layer neural network. As an example of the machine learning model according to this modification, an RNN, which is a neural network that handles time-series information, is described with reference to FIGS. 15(a) and 15(b). Long short-term memory (hereinafter, LSTM), which is a type of RNN, is described with reference to FIGS. 16(a) and 16(b).

FIG. 15(a) shows the structure of an RNN, which is a machine learning model. The RNN 1520 has a loop structure in its network; at time t it receives data x^t 1510 and outputs data h^t 1530. Because of this loop, the RNN 1520 can carry the state at the current time over to the next state, and can therefore handle time-series information. FIG. 15(b) shows an example of the input and output of parameter vectors at time t. The data x^t 1510 contains N items of data (Params1 to ParamsN). The data h^t 1530 output from the RNN 1520 contains N items of data (Params1 to ParamsN) corresponding to the input data.
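To make the loop structure concrete, the following sketch implements one common recurrent update, h^t = tanh(W_x x^t + W_h h^(t-1) + b). This particular update rule and the weight names `w_x`, `w_h`, and `b` are assumptions (the standard Elman-style form); the source describes the RNN 1520 only at the block-diagram level.

```python
import math
from typing import List, Sequence

Vector = List[float]
Matrix = Sequence[Sequence[float]]


def matvec(matrix: Matrix, vector: Sequence[float]) -> Vector:
    """Matrix-vector product using plain lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rnn_step(x_t, h_prev, w_x: Matrix, w_h: Matrix, b) -> Vector:
    """One time step: combine the current input with the previous state."""
    from_input = matvec(w_x, x_t)
    from_state = matvec(w_h, h_prev)
    return [math.tanh(i + s + bias) for i, s, bias in zip(from_input, from_state, b)]


def run_rnn(xs, h0, w_x: Matrix, w_h: Matrix, b) -> List[Vector]:
    """Feed a sequence x^1..x^T through the loop and collect h^1..h^T."""
    h, outputs = list(h0), []
    for x_t in xs:
        h = rnn_step(x_t, h, w_x, w_h, b)  # the state is carried to the next step
        outputs.append(h)
    return outputs
```

Because `h` is passed back into `rnn_step`, an input at one time step still influences the output at later steps even when the later inputs are zero, which is the property that lets the network handle time-series information.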

However, because an RNN cannot handle long-term information during error backpropagation, an LSTM is sometimes used. By having a forget gate, an input gate, and an output gate, an LSTM can learn long-term information. FIG. 16(a) shows the structure of the LSTM. In the LSTM 1640, the information that the network carries over to the next time t is the internal state of the network, called the cell, c^(t-1), and the output data h^(t-1). The lowercase letters (c, h, x) in the figure denote vectors.

Next, FIG. 16(b) shows the LSTM 1640 in detail. FIG. 16(b) shows a forget gate network FG, an input gate network IG, and an output gate network OG, each of which is a sigmoid layer and therefore outputs a vector in which each element takes a value between 0 and 1. The forget gate network FG determines how much past information is retained, and the input gate network IG determines which values are updated. FIG. 16(b) also shows a cell update candidate network CU, which is a tanh activation layer; it creates a vector of new candidate values to be added to the cell. The output gate network OG selects elements of the cell candidate and selects how much information is passed on to the next time.
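For reference, the four networks can be written in the widely used basic LSTM formulation. The weight matrices W and biases b are symbols introduced here for illustration, and the exact wiring of FIG. 16(b) may differ; the subscripts FG, IG, CU, and OG follow the network names in the text, and superscripts denote time as in the figures.

```latex
\begin{aligned}
f^{t} &= \sigma\!\left(W_{FG}\,[h^{t-1},\, x^{t}] + b_{FG}\right) && \text{forget gate network FG} \\
i^{t} &= \sigma\!\left(W_{IG}\,[h^{t-1},\, x^{t}] + b_{IG}\right) && \text{input gate network IG} \\
\tilde{c}^{\,t} &= \tanh\!\left(W_{CU}\,[h^{t-1},\, x^{t}] + b_{CU}\right) && \text{cell update candidate network CU} \\
o^{t} &= \sigma\!\left(W_{OG}\,[h^{t-1},\, x^{t}] + b_{OG}\right) && \text{output gate network OG} \\
c^{t} &= f^{t} \odot c^{t-1} + i^{t} \odot \tilde{c}^{\,t} \\
h^{t} &= o^{t} \odot \tanh\!\left(c^{t}\right)
\end{aligned}
```

Here σ is the element-wise sigmoid, ⊙ is element-wise multiplication, and [h^{t-1}, x^{t}] is the concatenation of the previous output and the current input. Because f^t and i^t lie between 0 and 1, they act as soft switches on how much of the old cell state is kept and how much of the candidate is written.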

Because the LSTM model described above is the basic form, the network is not limited to the one shown here. The connections between networks may be changed. A QRNN (quasi-recurrent neural network) may be used instead of an LSTM. Furthermore, the machine learning model is not limited to a neural network; boosting, a support vector machine, or the like may be used. When the instruction from the examiner is input by characters, voice, or the like, a technique related to natural language processing (for example, sequence-to-sequence) may be applied. In that case, as the natural language processing technique, for example, a model that produces an output for each input sentence may be applied. The various trained models described above may be applied not only to instructions from the examiner but also to output to the examiner. A dialogue engine (dialogue model, trained model for dialogue) that responds to the examiner with output in characters, voice, or the like may also be applied.

As a technique related to natural language processing, a trained model obtained by pre-training on document data with unsupervised learning may be used. A trained model obtained by further applying transfer learning (or fine-tuning) to such a pre-trained model, according to the purpose, may also be used. As a technique related to natural language processing, for example, BERT (Bidirectional Encoder Representations from Transformers) may be applied. A model that can extract (represent) context (features) by itself, by predicting a specific word in a sentence from both its left and right context, may also be applied. A model that can judge the relationship (continuity) between two sequences (sentences) in input time-series data may also be applied. A model in which the Transformer encoder is used in the hidden layers, and which takes a sequence of vectors as input and produces a sequence of vectors as output, may also be applied.

Here, the instruction from the examiner to which this modification is applicable may be any instruction that relates to at least one of the following: changing the display of the various images and analysis results described in the embodiments and modifications above; selecting the depth range for generating an En-Face image; selecting whether to use data as training data for additional training; selecting a trained model; and outputting (displaying, transmitting, and so on) or saving results obtained using the various trained models. The instruction from the examiner to which this modification is applicable may be given before imaging as well as after imaging; for example, it may be an instruction relating to various adjustments, an instruction relating to the setting of various imaging conditions, or an instruction relating to the start of imaging. It may also be an instruction relating to a change of the display screen (screen transition).

The machine learning model may be one that combines a machine learning model for images, such as a CNN, with a machine learning model for time-series data, such as an RNN. Such a model can learn, for example, the relationship between image features and time-series features. When the input-layer side of the model is a CNN and the output-layer side is an RNN, training may be performed using learning data in which, for example, a medical image is the input data and text relating to that medical image (for example, the presence or absence of a lesion, the type of lesion, or a recommendation for the next examination) is the output data. As a result, medical information about the medical image is automatically explained in text, so even an examiner with little medical experience can easily grasp it. When the input-layer side is an RNN and the output-layer side is a CNN, training may be performed using learning data in which, for example, medical text such as lesions, findings, and diagnoses is the input data and a medical image corresponding to that text is the output data. As a result, for example, the examiner can easily retrieve medical images related to a case the examiner wants to check.

A machine translation engine (machine translation model, trained model for machine translation) that machine-translates text, whether written or spoken, into an arbitrary language may be used for instructions from the examiner and for output to the examiner. The language may be selectable according to an instruction from the examiner. The language may also be selected automatically by using a trained model that automatically recognizes the type of language. The automatically selected type of language may be correctable according to an instruction from the examiner. The natural language processing techniques described above (for example, Sequence to Sequence) may be applied to the machine translation engine. For example, text input to the machine translation engine may be machine-translated, and the machine-translated text may then be input to a character recognition engine or the like. Also, for example, text output from the various trained models described above may be input to the machine translation engine, and the text output from the machine translation engine may be output.

The various trained models described above may also be used in combination. For example, characters corresponding to an instruction from the examiner may be input to a character recognition engine, and speech obtained from the input characters may be input to another type of machine learning engine (for example, a machine translation engine). Also, for example, characters output from another type of machine learning engine may be input to the character recognition engine, and speech obtained from the input characters may be output. Also, for example, speech corresponding to an instruction from the examiner may be input to a speech recognition engine, and characters obtained from the input speech may be input to another type of machine learning engine (for example, a machine translation engine). Also, for example, speech output from another type of machine learning engine may be input to the speech recognition engine, and characters obtained from the input speech may be displayed on the output unit 103. In this case, whether output to the examiner is given as text or as speech may be selectable according to an instruction from the examiner. Likewise, whether the examiner's instruction is entered as text or as speech may be selectable according to an instruction from the examiner. The various configurations described above may also be adopted by selection based on an instruction from the examiner.

[Modification 9]
Label images, high-quality images, and the like relating to an image acquired by the main imaging may be saved in the storage unit 101-3 according to an instruction from the operator. In this case, for example, after the operator instructs that a high-quality image be saved, a recommended file name may be displayed when the file name is registered, in a state editable according to an instruction from the operator. The recommended file name contains, at some position (for example, at the beginning or the end), information (for example, characters) indicating that the image was generated by processing using a trained model for image quality enhancement (image quality enhancement processing). Similarly, for label images and the like, a file name containing information indicating that the image was generated by processing using a trained model may be displayed.
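
The recommended file name described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `suggest_filename`, the marker string `AIQ`, and the underscore separator are all assumptions introduced here. The returned string is only a suggestion that the operator could still edit before saving.

```python
from pathlib import Path


def suggest_filename(original: str, marker: str = "AIQ", position: str = "suffix") -> str:
    """Return a recommended file name that flags a model-generated image.

    `marker` is a hypothetical tag; the text only requires that some
    information identifying the processing appears at the start or the end.
    """
    path = Path(original)
    if position == "prefix":
        stem = f"{marker}_{path.stem}"
    elif position == "suffix":
        stem = f"{path.stem}_{marker}"
    else:
        raise ValueError("position must be 'prefix' or 'suffix'")
    # Keep the original extension so the file type is unchanged.
    return str(path.with_name(stem + path.suffix))


if __name__ == "__main__":
    print(suggest_filename("scan_001.png"))
    print(suggest_filename("scan_001.png", position="prefix"))
```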

When a high-quality image is displayed on the output unit 103 on various display screens such as a report screen, an indication that the displayed image is a high-quality image generated by processing using an image quality enhancement model may be displayed together with the high-quality image. In this case, the operator can easily tell from the indication that the displayed high-quality image is not the image acquired by imaging itself, which can reduce misdiagnosis and improve diagnostic efficiency. The indication that the image is a high-quality image generated by processing using the image quality enhancement model may take any form, as long as it allows the input image and the high-quality image generated by the processing to be distinguished. This applies not only to processing using the image quality enhancement model but also to processing using the various trained models described above: an indication that a result was generated by processing using that type of trained model may be displayed together with the result. For example, when an analysis result based on a segmentation result obtained with a trained model for image segmentation is displayed, an indication that the analysis result is based on a result obtained with the trained model for image segmentation may be displayed together with the analysis result.

In this case, a display screen such as the report screen may be saved in the storage unit 101-3 as image data according to an instruction from the operator. For example, the report screen may be saved in the storage unit 101-3 as a single image in which the high-quality images and the like are arranged side by side with an indication that these images were generated by processing using a trained model.

Regarding the indication that an image is a high-quality image generated by processing using the image quality enhancement model, an indication of what learning data the image quality enhancement model was trained on may also be displayed on the output unit 103. This indication may include a description of the types of input data and ground-truth data in the learning data, and any indication relating to the ground-truth data, such as the imaged site contained in the input data and the ground-truth data. Likewise, for processing using the various trained models described above, such as image segmentation processing, an indication of what learning data that type of trained model was trained on may be displayed on the output unit 103.

Information (for example, characters) indicating that an image was generated by processing using a trained model may be displayed or saved in a state superimposed on the image or the like. In this case, the information may be superimposed anywhere on the image, provided the location (for example, an edge of the image) does not overlap the region in which the site of interest being imaged is displayed. Alternatively, a non-overlapping region may be determined, and the information may be superimposed on the determined region. Images obtained not only by processing using the image quality enhancement model but also by processing using the various trained models described above, such as image segmentation processing, may be handled in the same way.
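
One way to determine a non-overlapping region is to test candidate positions at the image corners against the bounding box of the site of interest. The sketch below is only an assumption about how this could be done: the function names, the corner ordering, and the rectangle convention `(left, top, right, bottom)` with exclusive right and bottom edges are all introduced here for illustration.

```python
from typing import Optional, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def rects_overlap(a: Rect, b: Rect) -> bool:
    """Return True if the two axis-aligned rectangles share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def choose_label_position(
    image_size: Tuple[int, int],
    label_size: Tuple[int, int],
    roi: Rect,
    margin: int = 4,
) -> Optional[Tuple[int, int]]:
    """Pick the first image corner where the label does not cover the ROI.

    Returns the (x, y) of the label's top-left corner, or None if every
    corner would overlap the region of interest.
    """
    width, height = image_size
    label_w, label_h = label_size
    candidates = [
        (margin, margin),                                        # top-left
        (width - label_w - margin, margin),                      # top-right
        (margin, height - label_h - margin),                     # bottom-left
        (width - label_w - margin, height - label_h - margin),   # bottom-right
    ]
    for x, y in candidates:
        box = (x, y, x + label_w, y + label_h)
        if x >= 0 and y >= 0 and not rects_overlap(box, roi):
            return (x, y)
    return None
```

A caller would draw the text at the returned position, or fall back to a separate indication outside the image when `None` is returned.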

When the initial display of the report screen is set by default so that the image quality enhancement button or the like is active (image quality enhancement processing is on), a report image corresponding to the report screen containing the high-quality image and the like may be transmitted to a server according to an instruction from the examiner. When the button is set to be active by default, the report image corresponding to the report screen containing the high-quality image and the like may also be transmitted to the server (automatically) at the end of the examination (for example, when the imaging confirmation screen or the preview screen is changed to the report screen according to an instruction from the examiner). In this case, the report image transmitted to the server may be generated based on the various default settings (for example, settings relating to at least one of the depth range for generating the En-Face image on the initial display of the report screen, whether an analysis map is superimposed, whether the image is a high-quality image, and whether the screen is a display screen for follow-up observation). The same processing may be applied when the button represents switching of image segmentation processing.

[Modification 10]
In the embodiments and modifications described above, an image obtained with a first type of trained model among the various trained models described above (for example, a high-quality image, an image showing an analysis result such as an analysis map, an image showing a region recognition result, or an image showing a segmentation result) may be input to a second type of trained model different from the first type. In this case, a result of processing by the second type of trained model (for example, an analysis result, a diagnosis result, a region recognition result, or a segmentation result) may be generated.

Also, using a result of processing by a first type of trained model among the various trained models described above (for example, an analysis result, a diagnosis result, a region recognition result, or a segmentation result), an image to be input to a second type of trained model different from the first type may be generated from the image that was input to the first type of trained model. The image generated in this way is likely to be well suited to processing with the second type of trained model. Accordingly, the accuracy of the image obtained by inputting the generated image to the second type of trained model (for example, a high-quality image, an image showing an analysis result such as an analysis map, an image showing a region recognition result, or an image showing a segmentation result) can be improved.

A common image may be input to both the first type of trained model and the second type of trained model so that the generation (or display) of each processing result using these trained models is executed. In this case, for example, the generation (or display) of the processing results using these trained models may be executed collectively (in conjunction) according to an instruction from the examiner.

The type of image to be input (for example, a high-quality image, a region recognition result, an object recognition result, a segmentation result, or a similar case image), the type of processing result to be generated (or displayed) (for example, a high-quality image, a region recognition result, a diagnosis result, an analysis result, an object recognition result, a segmentation result, or a similar case image), and the type of input and type of output (for example, text, speech, or language) may each be selectable according to an instruction from the examiner. The type of input may also be selected automatically by using a trained model that automatically recognizes the type of input. The type of output may be selected automatically so as to correspond to the type of input (for example, to be the same type). The automatically selected type may be correctable according to an instruction from the examiner. In this case, at least one trained model may be selected according to the selected type. When a plurality of trained models is selected, how the trained models are combined (for example, the order in which data is input to them) may be determined according to the selected type. Note that, for example, the type of image to be input and the type of processing result to be generated (or displayed) may be configured to be selectable so that they differ, or, if they are the same, information prompting the examiner to select different types may be output to the examiner.
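
The selection logic above, in which the chosen input and output types determine which trained models run and in what order, can be sketched with a lookup table. Everything named here is hypothetical: `enhance`, `segment`, and `analyze` are placeholder functions standing in for trained models, and `MODEL_CHAINS` is an assumed registry, not a structure given in the source.

```python
from typing import Any, Callable, Dict, List, Tuple


# Hypothetical stand-ins for trained models; each only tags its input.
def enhance(image: Any) -> Dict[str, Any]:
    return {"kind": "high_quality_image", "source": image}


def segment(image: Any) -> Dict[str, Any]:
    return {"kind": "segmentation_result", "source": image}


def analyze(segmentation: Any) -> Dict[str, Any]:
    return {"kind": "analysis_result", "source": segmentation}


# Assumed registry: (input type, output type) -> ordered list of models.
MODEL_CHAINS: Dict[Tuple[str, str], List[Callable[[Any], Any]]] = {
    ("tomographic_image", "high_quality_image"): [enhance],
    ("tomographic_image", "segmentation_result"): [enhance, segment],
    ("tomographic_image", "analysis_result"): [enhance, segment, analyze],
}


def run_pipeline(input_kind: str, output_kind: str, data: Any) -> Any:
    """Run the chain of models selected by the input and output types."""
    if input_kind == output_kind:
        # Mirrors the prompt to the examiner to choose different types.
        raise ValueError("Input and output types are the same; select different types.")
    try:
        chain = MODEL_CHAINS[(input_kind, output_kind)]
    except KeyError:
        raise ValueError(f"No model chain for {input_kind} -> {output_kind}") from None
    for model in chain:
        data = model(data)  # output of one model feeds the next
    return data
```

In a real system the error for identical types would more likely surface as a message on the display than as an exception; the exception is used here only to keep the sketch short.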

Each trained model may be executed at any location. For example, some of the trained models may be used on a cloud server while others are used on another server such as a fog server or an edge server. When a network within a facility, within a site containing a facility, or within a region containing multiple facilities is configured for wireless communication, the reliability of the network may be improved by, for example, using radio waves in a dedicated wavelength band allocated exclusively to that facility, site, or region. The network may also be built on wireless communication that offers high speed, large capacity, low latency, and many simultaneous connections. With these, for example, surgery such as vitreous, cataract, glaucoma, corneal refractive correction, and external-eye surgery, and treatment such as laser photocoagulation, can be supported in real time even remotely. In this case, for example, a fog server, edge server, or the like that has wirelessly received at least one of the various medical images obtained by an apparatus for such surgery or treatment may wirelessly transmit information, obtained using at least one of the various trained models, to the apparatus for surgery or treatment. Also, for example, the information wirelessly received by the apparatus for surgery or treatment may be the movement amount (vector) of an optical system or optical member as described above, in which case the apparatus for surgery or treatment may be controlled automatically. Also, for example, for the purpose of assisting the examiner's operation, the control may be configured as automatic control that requires the examiner's permission (semi-automatic control).

A similar case image search using an external database stored on a server or the like may be performed with analysis results, diagnosis results, and the like obtained by processing with the trained models described above as search keys. A similar case image search using such an external database may also be performed with object recognition results, segmentation results, and the like obtained by processing with the various trained models described above as search keys. When the medical images stored in the database are already managed with their respective features attached as supplementary information by machine learning or the like, a similar case image search engine (similar case image search model, trained model for similar case image search) that uses the medical image itself as the search key may be used. For example, the image processing apparatus 101 can use a trained model for similar case image search (different from the trained model for image quality enhancement) to search, from various medical images, for similar case images related to those medical images.

Also, for example, the display control unit 101-5 can cause the output unit 103 to display similar case images obtained from various medical images using the trained model for similar case image search. Here, a similar case image is, for example, an image whose features are similar to the features of the medical image input to the trained model. When the medical image input to the trained model contains a partial region such as an abnormal site, the similar case image is, for example, an image whose features are similar to the features of that partial region. Consequently, not only can training for accurately searching similar case images be performed efficiently, but when a medical image contains an abnormal site, the examiner can also diagnose the abnormal site efficiently. A plurality of similar case images may be retrieved, and they may be displayed so that the order of feature similarity can be identified. The trained model for similar case image search may also be additionally trained using learning data that includes an image selected from the plurality of similar case images according to an instruction from the examiner and the features of that image.
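
Ranking similar case images by feature similarity, as described above, can be illustrated with a small example. The source does not specify a similarity measure; cosine similarity is an assumption chosen here, and `rank_similar_cases` and the dictionary-based feature store are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar_cases(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (case id, similarity) pairs, most similar first."""
    scored = [(case_id, cosine_similarity(query, features))
              for case_id, features in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

Because the result is sorted, displaying it in list order makes the order of similarity identifiable to the examiner.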

The learning data for the various trained models is not limited to data obtained with the very ophthalmologic apparatus that performs the actual imaging. Depending on the desired configuration, it may be data obtained with an ophthalmologic apparatus of the same model, data obtained with an ophthalmologic apparatus of the same type, or the like.

The various trained models according to the embodiments and modifications described above can be provided in the image processing apparatus 101. A trained model may be implemented, for example, as a software module executed by a processor such as a CPU, MPU, GPU, or FPGA, or as a circuit that performs a specific function, such as an ASIC. These trained models may also be provided in another apparatus, such as a server, connected to the image processing apparatus 101. In this case, the image processing apparatus 101 can use a trained model by connecting, via any network such as the Internet, to the server or the like that holds the trained model. The server that holds the trained model may be, for example, a cloud server, a fog server, or an edge server. When a network within a facility, within a site containing a facility, or within a region containing multiple facilities is configured for wireless communication, the reliability of the network may be improved by, for example, using radio waves in a dedicated wavelength band allocated exclusively to that facility, site, or region. The network may also be built on wireless communication that offers high speed, large capacity, low latency, and many simultaneous connections.

[Modification 11]
The medical images processed by the image processing apparatus 101 according to the various embodiments and modifications described above include images acquired using any modality (imaging apparatus, imaging method). The medical images to be processed can include medical images acquired by any imaging apparatus or the like, and images created by a medical image processing apparatus or a medical image processing method.

Furthermore, the medical image to be processed is an image of a predetermined site of the subject, and the image of the predetermined site includes at least part of that site. The medical image may also include other sites of the subject. The medical image may be a still image or a moving image, and may be a monochrome image or a color image. The medical image may be an image representing the structure (morphology) of the predetermined site or an image representing its function. Images representing function include, for example, images representing hemodynamics (blood flow volume, blood flow velocity, etc.), such as OCTA images, Doppler OCT images, fMRI images, and ultrasound Doppler images. The predetermined site of the subject may be determined according to the imaging target and includes any site, such as organs including the human eye (eye to be examined), brain, lungs, intestines, heart, pancreas, kidneys, and liver, as well as the head, chest, legs, and arms.

The medical image may be a tomographic image of the subject or a front image. Front images include, for example, an SLO image of the fundus or the anterior segment, a fluorescence-imaged fundus image, and an En-Face image generated from data acquired by OCT (three-dimensional OCT data) using data in at least a partial range in the depth direction of the imaging target. The En-Face image may be an OCTA En-Face image (motion contrast front image) generated from three-dimensional OCTA data (three-dimensional motion contrast data) using data in at least a partial range in the depth direction of the imaging target. Three-dimensional OCT data and three-dimensional motion contrast data are examples of three-dimensional medical image data.

Here, motion contrast data is data indicating changes between a plurality of volume data sets obtained by controlling the measurement light so that it scans the same region (same position) of the subject's eye a plurality of times. Each volume data set is composed of a plurality of tomographic images obtained at different positions. By obtaining, at each of those positions, data indicating the change between the plurality of tomographic images obtained at substantially the same position, motion contrast data can be obtained as volume data. Note that a motion contrast front image is also called an OCTA front image (OCTA En-Face image) in connection with OCT angiography (OCTA), which measures the movement of blood flow, and motion contrast data is also called OCTA data.

Motion contrast data can be obtained, for example, as a decorrelation value, a variance value, or the maximum value divided by the minimum value (maximum value/minimum value) between two tomographic images or the interference signals corresponding to them, and may be obtained by any known method. The two tomographic images can be obtained, for example, by controlling the measurement light so that it scans the same region (same position) of the subject's eye a plurality of times.

When the scanning means is controlled so that the measurement light scans substantially the same position a plurality of times, the time interval between one scan (one B-scan) and the next scan (the next B-scan) may be configured to be changed (determined). As a result, a blood vessel region can be visualized accurately even when, for example, the blood flow velocity differs depending on the state of the blood vessel. In this case, for example, the time interval may be changeable in accordance with an instruction from the examiner. Also, for example, one motion contrast image may be selectable, in accordance with an instruction from the examiner, from a plurality of motion contrast images corresponding to a plurality of preset time intervals. Also, for example, the time interval at which motion contrast data was acquired may be stored in the storage unit 101-3 in association with that motion contrast data. Also, for example, the display control unit 101-5 may cause the output unit 103 to display the time interval at which motion contrast data was acquired together with the motion contrast image corresponding to that motion contrast data. Also, for example, the time interval, or at least one candidate for the time interval, may be determined automatically. In this case, for example, a machine learning model may be used to determine (output) the time interval from a motion contrast image. Such a machine learning model can be obtained, for example, by training on training data in which a plurality of motion contrast images corresponding to a plurality of time intervals are the input data, and the differences between those time intervals and the time interval at which a desired motion contrast image was acquired are the ground-truth data.
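The three ways of computing motion contrast mentioned above (decorrelation, variance, and maximum/minimum ratio) can be illustrated with a minimal NumPy sketch. This is not the method of the embodiments, which allow any known method. The function name `motion_contrast`, the array layout `(repeats, depth, width)`, the stabilizing constant `eps`, and the particular decorrelation formula `1 - 2AB / (A^2 + B^2)` are all assumptions made for illustration.

```python
import numpy as np


def motion_contrast(bscans, method="decorrelation", eps=1e-6):
    """Compute a motion contrast map from repeated B-scans (illustrative only).

    bscans: array of shape (repeats, depth, width) holding intensity
            tomograms acquired at approximately the same position.
    method: "decorrelation", "variance", or "ratio" (maximum / minimum).
    """
    b = np.asarray(bscans, dtype=np.float64)
    if b.ndim != 3 or b.shape[0] < 2:
        raise ValueError("expected at least two B-scans of shape (depth, width)")

    if method == "decorrelation":
        first, second = b[:-1], b[1:]
        # Assumed formula: 1 - 2AB / (A^2 + B^2) for each consecutive pair,
        # averaged over all pairs. eps avoids division by zero.
        d = 1.0 - (2.0 * first * second) / (first**2 + second**2 + eps)
        return d.mean(axis=0)
    if method == "variance":
        # Per-pixel variance across the repeated scans.
        return b.var(axis=0)
    if method == "ratio":
        # Per-pixel maximum divided by per-pixel minimum.
        return b.max(axis=0) / (b.min(axis=0) + eps)
    raise ValueError(f"unknown method: {method}")
```

Under these assumptions, pixels whose intensity is unchanged between repeats (static tissue) give a decorrelation and variance near zero and a ratio near one, while pixels whose intensity changes (for example, because of blood flow) give larger values. Noise thresholding and registration between repeats, which a practical pipeline would likely need, are omitted.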

An En-Face image is, for example, a front image generated by projecting data in the range between two layer boundaries in the XY directions. In this case, the front image is generated by projecting or integrating, onto a two-dimensional plane, data corresponding to a depth range that is at least part of the volume data (a three-dimensional tomographic image) obtained using optical interference and that is determined based on two reference planes. The En-Face image is a front image generated by projecting, onto a two-dimensional plane, the data in the volume data that corresponds to a depth range determined based on the detected retinal layers. As a method of projecting data corresponding to a depth range determined based on two reference planes onto a two-dimensional plane, for example, a method that uses a representative value of the data within the depth range as the pixel value on the two-dimensional plane can be used. Here, the representative value can include a value such as the average, median, or maximum of the pixel values within the depth-direction range of the region bounded by the two reference planes. The depth range of the En-Face image may also be, for example, a range that extends a predetermined number of pixels in the deeper or shallower direction from one of the two layer boundaries of the detected retinal layers. The depth range of the En-Face image may also be, for example, a range changed (offset), in accordance with an operator's instruction, from the range between the two layer boundaries of the detected retinal layers.
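The projection described above, in which a representative value (average, median, or maximum) within the depth range between two reference planes becomes each pixel of the front image, can be sketched as follows. This is a minimal illustration, not the implementation of the embodiments. The function name `en_face`, the `(Z, Y, X)` axis order with smaller Z meaning shallower, the half-open depth range, and the `offset` parameter standing in for the operator-specified shift are all assumptions.

```python
import numpy as np


def en_face(volume, top_surface, bottom_surface, mode="mean", offset=0):
    """Project a depth range of a volume onto a 2D front image (illustrative).

    volume:         array of shape (Z, Y, X), with Z as the depth axis.
    top_surface:    integer array of shape (Y, X), shallower boundary index.
    bottom_surface: integer array of shape (Y, X), deeper boundary index
                    (exclusive).
    mode:           "mean", "median", or "max" representative value.
    offset:         shift, in pixels, applied to both boundaries.
    """
    vol = np.asarray(volume, dtype=np.float64)
    z = np.arange(vol.shape[0])[:, None, None]
    top = np.asarray(top_surface)[None] + offset
    bottom = np.asarray(bottom_surface)[None] + offset

    # Keep only voxels between the two surfaces; everything else becomes NaN
    # so that the NaN-aware reducers ignore it.
    inside = (z >= top) & (z < bottom)
    masked = np.where(inside, vol, np.nan)

    reducers = {"mean": np.nanmean, "median": np.nanmedian, "max": np.nanmax}
    if mode not in reducers:
        raise ValueError(f"unknown mode: {mode}")
    # Columns with an empty depth range yield NaN (NumPy emits a warning).
    return reducers[mode](masked, axis=0)
```

Because the surfaces are per-pixel arrays, the same sketch covers both flat reference planes and curved layer boundaries obtained from segmentation. A nonzero `offset` moves the whole range deeper (positive) or shallower (negative).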

An imaging apparatus is an apparatus for capturing images used for diagnosis. Imaging apparatuses include, for example, an apparatus that obtains an image of a predetermined site of a subject by irradiating the site with light, radiation such as X-rays, electromagnetic waves, ultrasonic waves, or the like, and an apparatus that obtains an image of a predetermined site by detecting radiation emitted from the subject. More specifically, the imaging apparatuses according to the various embodiments and modifications described above include at least an X-ray imaging apparatus, a CT apparatus, an MRI apparatus, a PET apparatus, a SPECT apparatus, an SLO apparatus, an OCT apparatus, an OCTA apparatus, a fundus camera, and an endoscope. The configurations according to the embodiments and modifications described above can be applied to these imaging apparatuses. In this case, the movement of the subject that corresponds to the movement of the subject's eye to be predicted, described above, may be, for example, movement of the face or body, or movement of the heart (heartbeat).

The OCT apparatus may include a time-domain OCT (TD-OCT) apparatus or a Fourier-domain OCT (FD-OCT) apparatus. The Fourier-domain OCT apparatus may include a spectral-domain OCT (SD-OCT) apparatus or a swept-source OCT (SS-OCT) apparatus. The OCT apparatus may also include a Line-OCT apparatus (or SS-Line-OCT apparatus) using line light, a Full Field-OCT apparatus (or SS-Full Field-OCT apparatus) using area light, or a Doppler-OCT apparatus. The SLO apparatus and the OCT apparatus may include an adaptive optics SLO (AO-SLO) apparatus and an adaptive optics OCT (AO-OCT) apparatus that use a wavefront compensation optical system. The SLO apparatus and the OCT apparatus may include a polarization-sensitive SLO (PS-SLO) apparatus and a polarization-sensitive OCT (PS-OCT) apparatus for visualizing information on polarization phase difference and depolarization. The SLO apparatus and the OCT apparatus may also include a pathological microscope SLO apparatus and a pathological microscope OCT apparatus, a handheld SLO apparatus and a handheld OCT apparatus, a catheter SLO apparatus and a catheter OCT apparatus, a head-mounted SLO apparatus and a head-mounted OCT apparatus, or a binocular-type SLO apparatus and a binocular-type OCT apparatus. The SLO apparatus and the OCT apparatus may be able to change the imaging angle of view by means of a configuration capable of optical magnification change. The SLO apparatus may be able to acquire color images and fluorescence images by using R, G, and B light sources with either a configuration in which a single light receiving element receives the light in a time-division manner or a configuration in which a plurality of light receiving elements receive the light simultaneously.

In the embodiments and modifications described above, the image processing apparatus 101 is configured as part of the OCT apparatus, but the image processing apparatus 101 may be configured separately from the OCT apparatus. In this case, the image processing apparatus 101 may be connected to the imaging apparatus 100 and other components of the OCT apparatus via the Internet or the like. The configuration of the OCT apparatus is not limited to the configuration described above, and part of the configuration included in the OCT apparatus, for example the SLO imaging unit, may be configured separately from the OCT apparatus.

Trained models for speech recognition, character recognition, gesture recognition, and the like are considered to extract the slope between consecutive input time-series data values as part of their features and to use it in estimation processing. Such trained models are expected to achieve accurate estimation by using the effect of temporal changes in specific numerical values in the estimation processing. Likewise, the trained models for image quality improvement, region recognition, segmentation processing, image analysis, and diagnosis result generation according to the embodiments and modifications described above are considered to extract the magnitude of the luminance values of a tomographic image, and the order, slope, position, distribution, continuity, and the like of its bright and dark portions, as part of their features and to use them in estimation processing.

(Other Embodiments)
The present invention can also be realized by a process in which software (a program) that implements one or more functions of the various embodiments and modifications described above is supplied to a system or apparatus via a network or a storage medium, and a computer of the system or apparatus reads and executes the program. The computer has one or more processors or circuits, and may include a network of a plurality of separate computers, or of a plurality of separate processors or circuits, in order to read and execute computer-executable instructions.

In this case, the processor or circuit may include a central processing unit (CPU), a micro processing unit (MPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA). The processor or circuit may also include a digital signal processor (DSP), a data flow processor (DFP), or a neural processing unit (NPU).

Although the present invention has been described above with reference to the embodiments and modifications, the present invention is not limited to those embodiments and modifications. Inventions modified within a range that does not depart from the spirit of the present invention, and inventions equivalent to the present invention, are also included in the present invention. The embodiments and modifications described above can be combined as appropriate within a range that does not depart from the spirit of the present invention.

Claims

1. An image processing apparatus comprising an image processing unit that inputs, as input data, a first image, which is a medical image having a second image size larger than a first image size, to a trained model obtained by training with training data including input data that is a medical image having the first image size, and thereby outputs, as output data, a second image having the second image size.

2. The image processing apparatus according to claim 1, wherein the image processing unit inputs, as input data, a plurality of medical images having the second image size, obtained by dividing a medical image having a third image size larger than the second image size, to the trained model, and thereby outputs, as output data, a plurality of images corresponding to the plurality of medical images.

3. The image processing apparatus according to claim 2, wherein the image processing unit generates an image having the third image size by combining the plurality of output images.

4. The image processing apparatus according to claim 1, wherein the trained model includes an Encoder-Decoder type Transformer.

5. The image processing apparatus according to any one of claims 1 to 4, wherein the image processing unit performs, on a target region in the first image or the second image, image processing such that the difference between the pixel values of the target region and the pixel values of regions other than the target region widens and the pixel values of the target region become lower.

6. The image processing apparatus according to claim 5, wherein the image processing includes a process of blending the first image and the second image such that the difference between the pixel values of the target region and the pixel values of regions other than the target region widens and the pixel values of the target region become lower.

7. The image processing apparatus according to claim 5, wherein the image processing unit detects a target region in the medical image using the first image or the second image, and performs the image processing on the target region in the second image that corresponds to the target region detected in the first image.

8. The image processing apparatus according to any one of claims 1 to 7, wherein the image processing unit detects a target region using the trained model.

9. The image processing apparatus according to any one of claims 1 to 7, wherein the medical image is a front image, and the image processing unit detects a two-dimensional target region using the trained model obtained by training with, as training data, a plurality of front images of a subject corresponding to a plurality of depth ranges.

10. The image processing apparatus according to any one of claims 1 to 7, wherein the medical image is a front image, and the image processing unit selects, from among a plurality of trained models obtained by training with, as training data, a plurality of front images of a subject corresponding to a plurality of depth ranges, the trained model corresponding to a depth range selected in accordance with an instruction from an examiner, and detects a two-dimensional target region using the selected trained model.

11. The image processing apparatus according to ..., wherein the medical image is a three-dimensional medical image, and the image processing unit detects a three-dimensional target region using the trained model obtained by training with a three-dimensional medical image of a subject as training data.

12. The image processing apparatus according to any one of claims 1 to 7, wherein the image processing unit uses the first image to detect the target region by rule-based processing based on the structure of the subject.

13. The image processing apparatus according to any one of claims 1 to 12, wherein the image processing includes processing for correcting at least one of brightness and contrast such that the difference between the pixel values of the target region and the pixel values of regions other than the target region widens.

14. The image processing apparatus according to any one of claims 1 to 13, wherein the medical image of the subject is a motion contrast image of the subject's eye, and the target region includes at least one of a non-perfusion region, a foveal vessel region, and an optic disc region.

15. The image processing apparatus according to any one of claims 1 to 14, wherein the output unit performs image quality improvement processing using a first trained model obtained by training with medical images of a subject as training data, performs image quality improvement processing on the first image using a second trained model obtained by training with, as training data, medical images of a subject in which the target region has undergone blend processing or correction of at least one of brightness and contrast, and outputs a second medical image of the subject.

16. The image processing apparatus according to any one of claims 1 to 15, further comprising a display control unit that causes a display unit to display a live moving image of the subject.

17. The image processing apparatus according to claim 16, wherein the display control unit causes the display unit to display a high-quality image that is generated using a trained model obtained by training with training data including images of a subject and that is obtained by inputting the medical image of the subject.

18. The image processing apparatus according to claim 17, wherein the display control unit causes the display unit to display, as the live moving image, an anterior eye image generated as the high-quality image; causes the display unit to display, as the live moving image, an SLO image that is generated as the high-quality image and on which a line indicating the position of a tomographic image generated as the high-quality image is superimposed; and causes the display unit to display, as the live moving image, the tomographic image corresponding to the position of the line on the SLO image.

19. The image processing apparatus according to claim 18, wherein the display control unit displays information indicating a blood vessel region in the tomographic image that is generated as the high-quality image and corresponds to the position of the line, superimposed on the tomographic image corresponding to the position of the line.

20. The image processing apparatus according to any one of claims 16 to 19, wherein the display control unit causes the display unit to display an analysis result that is generated using a trained model for analysis result generation obtained by training with training data including images of a subject and that is obtained by inputting an image of the subject.

21. The image processing apparatus according to any one of claims 16 to 20, wherein the display control unit causes the display unit to display a diagnosis result that is generated using a trained model for diagnosis result generation obtained by training with training data including images of a subject and that is obtained by inputting an image of the subject.

22. The image processing apparatus according to any one of claims 16 to 21, wherein the display control unit causes the display unit to display, as information about an abnormal site, information about the difference between an image that is generated using a generative adversarial network or an autoencoder and obtained by inputting an image of the subject, and the image input to the generative adversarial network or the autoencoder.

23. The image processing apparatus according to any one of claims 16 to 22, wherein the display control unit causes the display unit to display a similar case image that is retrieved using a trained model for similar case image retrieval obtained by training with training data including images of a subject and that is obtained by inputting an image of the subject.

24. The image processing apparatus according to any one of claims 16 to 23, wherein the display control unit causes the display unit to display an object recognition result or a segmentation result that is generated using a trained model for object recognition or a trained model for segmentation obtained by training with training data including images of a subject and that is obtained by inputting an image of the subject.

25. The image processing apparatus according to any one of claims 1 to 24, wherein an operator's instruction for obtaining the second image is information obtained using at least one of a trained model for character recognition, a trained model for speech recognition, and a trained model for gesture recognition.

26. An image processing method comprising a step of inputting, as input data, a first image, which is a medical image having a second image size larger than a first image size, to a trained model obtained by training with training data including input data that is a medical image having the first image size, and thereby outputting, as output data, a second image having the second image size.

27. A program that, when executed by a computer, causes the computer to perform each step of the image processing method according to claim 26.