JP7284688B2

JP7284688B2 - Image processing device, image processing method, image processing program and recording medium

Info

Publication number: JP7284688B2
Application number: JP2019198342A
Authority: JP
Inventors: 智親竹嶋; 二三生橋本; 貴文樋口
Original assignee: Hamamatsu Photonics KK
Current assignee: Hamamatsu Photonics KK
Priority date: 2019-10-31
Filing date: 2019-10-31
Publication date: 2023-05-31
Anticipated expiration: 2039-10-31
Also published as: JP2021071936A

Description

The present invention relates to an image processing device, an image processing method, an image processing program, and a recording medium.

In many cases, noise is superimposed on an image acquired by imaging or the like. For example, in an image captured by an optical device such as a fluorescence microscope, the shorter the exposure time, the more noise is superimposed. A large amount of noise is also superimposed on tomographic images created by reconstruction processing in PET devices, X-ray CT devices, MRI devices, and the like. Various techniques are known for reducing the noise superimposed on a target image.

The median-based noise reduction technique sorts the luminance values of the pixels surrounding each pixel of the target image so that no pixel has a luminance that differs greatly from its neighbors. However, this technique has problems such as lowering the resolution of the image and deforming the image.

The smoothing-based noise reduction technique inherently lowers the spatial resolution of the image. However, edge-preserving noise reduction techniques such as the Non-Local Means filter have since been proposed, making it possible to smooth non-edge portions of the target image while preserving its edges.

In recent years, attention has also been focused on the Deep Image Prior technique, which uses a convolutional neural network (CNN), a type of deep neural network. This noise reduction technique reduces the noise of a target image by exploiting the property of CNNs that meaningful structures in the target image are learned faster than random noise (that is, random noise is difficult to learn) (Non-Patent Documents 1 to 3).

Non-Patent Document 1: Dmitry Ulyanov, et al., "Deep Image Prior", [online], [searched on October 8, 2019], Internet <URL: https://dmitryulyanov.github.io/deep_image_prior>
Non-Patent Document 2: Dmitry Ulyanov, et al., "Deep Image Prior", [online], [searched on October 8, 2019], Internet <URL: https://sites.skoltech.ru/app/data/uploads/sites/25/2018/04/deep_image_prior.pdf>
Non-Patent Document 3: Dmitry Ulyanov, et al., "Deep Image Prior", [online], [searched on October 8, 2019], Internet <URL: https://box.skoltech.ru/index.php/s/ib52BOoV58ztuPM#pdfviewer>
Non-Patent Document 4: Jaakko Lehtinen, et al., "Noise2Noise: Learning Image Restoration without Clean Data", [online], [searched on October 8, 2019], Internet <URL: https://arxiv.org/abs/1803.04189>

With conventional noise reduction techniques, it is difficult to reduce noise effectively when there is only one target image whose noise is to be reduced, or when the SN ratio of the target image is low (that is, when noise is superimposed on the signal component of the target image at a non-negligible ratio).

The present invention has been made to solve the above problems, and its object is to provide an image processing device and an image processing method that can effectively reduce the noise of a target image even when there is only one target image or when the SN ratio of the target image is low.

The image processing device of the present invention includes: (1) a first processing unit that, for each of a plurality of random noise images, repeatedly trains a convolutional neural network using the random noise image as an input image and a target image as a teacher image, and acquires the image output from the convolutional neural network after the repeated training as an intermediate image; and (2) a second processing unit that creates, based on the plurality of intermediate images acquired by the first processing unit, an output image in which the noise of the target image is reduced.

In one aspect of the image processing device of the present invention, the first processing unit preferably ends the repeated training when, during the repeated training, the difference between the image output from the convolutional neural network and the target image falls within a target range, and acquires the image output from the convolutional neural network at that point as the intermediate image. The first processing unit preferably uses a convolutional neural network of a non-encoder-decoder type. The first processing unit preferably performs in parallel any two or more of the processes of acquiring an intermediate image for each of the plurality of random noise images.

In another aspect of the image processing device of the present invention, the second processing unit preferably creates the output image by training a convolutional neural network with the Noise2Noise method, using one of the plurality of intermediate images as an input image and another as a teacher image. Alternatively, the second processing unit preferably creates the output image by averaging the plurality of intermediate images.

The image processing method of the present invention includes: (1) a first processing step of, for each of a plurality of random noise images, repeatedly training a convolutional neural network using the random noise image as an input image and a target image as a teacher image, and acquiring the image output from the convolutional neural network after the repeated training as an intermediate image; and (2) a second processing step of creating, based on the plurality of intermediate images acquired in the first processing step, an output image in which the noise of the target image is reduced.

In one aspect of the image processing method of the present invention, in the first processing step, the repeated training is preferably ended when, during the repeated training, the difference between the image output from the convolutional neural network and the target image falls within a target range, and the image output from the convolutional neural network at that point is acquired as the intermediate image. In the first processing step, a convolutional neural network of a non-encoder-decoder type is preferably used. In the first processing step, any two or more of the processes of acquiring an intermediate image for each of the plurality of random noise images are preferably performed in parallel.

In another aspect of the image processing method of the present invention, in the second processing step, the output image is preferably created by training a convolutional neural network with the Noise2Noise method, using one of the plurality of intermediate images as an input image and another as a teacher image. Alternatively, in the second processing step, the output image is preferably created by averaging the plurality of intermediate images.

The image processing program of the present invention causes a computer to execute the first processing step and the second processing step of the image processing method of the present invention described above. The recording medium of the present invention is a computer-readable medium on which the image processing program of the present invention described above is recorded.

According to the present invention, the noise of a target image can be effectively reduced even when there is only one target image or when the SN ratio of the target image is low.

FIG. 1 is a diagram showing the configuration of the image processing device 1 of this embodiment.
FIG. 2 is a flowchart of the image processing method of this embodiment.
FIG. 3 is a diagram showing the configuration of the first processing unit 10.
FIG. 4 is a diagram showing the configuration of the second processing unit 20.
FIG. 5(a) is a diagram showing an original image. FIG. 5(b) is a diagram showing the target image A.
FIG. 6(a) is a diagram showing an example of the random noise image B_n. FIG. 6(b) is a diagram showing an example of the intermediate image C_n.
FIG. 7 is a histogram showing the distribution of luminance values in the difference image between two intermediate images randomly selected from the intermediate images C_1 to C_100.
FIG. 8 is a graph showing the relationship between the number of training iterations of the CNN in the first processing step and the loss function value representing the difference between the image output from the CNN and the teacher image.
FIG. 9 is a graph showing an enlarged part of FIG. 8.
FIG. 10 is a graph showing the relationship between the number of training iterations of the CNN in the first processing step and the loss function value representing the difference between the image output from the CNN and the teacher image.
FIG. 11 is a graph showing an enlarged part of FIG. 10.
FIG. 12(a) is a diagram showing the image output from the CNN after 32 training iterations in the first processing step. FIG. 12(b) is a diagram showing the image output from the CNN after 64 training iterations in the first processing step.
FIG. 13(a) is a diagram showing the image output from the CNN after 96 training iterations in the first processing step. FIG. 13(b) is a diagram showing the image output from the CNN after 128 training iterations in the first processing step.
FIG. 14 is a diagram showing the target image A.
FIG. 15(a) is a diagram showing the output image D obtained by the example. FIG. 15(b) is a diagram showing the image obtained by Comparative Example 1.
FIG. 16(a) is a diagram showing the distribution of the fixed noise added to the original image in the region of interest R1. FIG. 16(b) is a diagram showing the distribution of values obtained by subtracting the target image A from the output image D obtained by the example. FIG. 16(c) is a diagram showing the distribution of values obtained by subtracting the target image A from the image obtained by Comparative Example 1.
FIG. 17(a) is a diagram showing the distribution of the fixed noise added to the original image in the region of interest R2. FIG. 17(b) is a diagram showing the distribution of values obtained by subtracting the target image A from the output image D obtained by the example. FIG. 17(c) is a diagram showing the distribution of values obtained by subtracting the target image A from the image obtained by Comparative Example 1.
FIG. 18(a) is a diagram showing the target image A in the region of interest R1. FIG. 18(b) is a diagram showing the output image D obtained by the example. FIG. 18(c) is a diagram showing the image obtained by Comparative Example 1.
FIG. 19(a) is a diagram showing the target image A in the region of interest R2. FIG. 19(b) is a diagram showing the output image D obtained by the example. FIG. 19(c) is a diagram showing the image obtained by Comparative Example 1.
FIG. 20 is a diagram showing an original image.
FIG. 21(a) is a diagram showing an original image. FIG. 21(b) is a diagram showing the target image A.
FIG. 22(a) is a diagram showing the output image D obtained by the example. FIG. 22(b) is a diagram showing the image obtained by Comparative Example 1.
FIG. 23(a) is a diagram showing the image obtained by Comparative Example 2. FIG. 23(b) is a diagram showing the image obtained by Comparative Example 3.
FIG. 24(a) is a diagram showing an original image. FIG. 24(b) is a diagram showing the target image A.
FIG. 25(a) is a diagram showing the output image D obtained by the example. FIG. 25(b) is a diagram showing the image obtained by Comparative Example 1.
FIG. 26(a) is a diagram showing the image obtained by Comparative Example 2. FIG. 26(b) is a diagram showing the image obtained by Comparative Example 3.
FIG. 27(a) is a diagram showing an original image. FIG. 27(b) is an enlarged view of part of FIG. 27(a).
FIG. 28(a) is a diagram showing the target image A. FIG. 28(b) is an enlarged view of part of FIG. 28(a).
FIG. 29(a) is a diagram showing the output image D obtained by the example. FIG. 29(b) is an enlarged view of part of FIG. 29(a).
FIG. 30(a) is a diagram showing the image obtained by Comparative Example 1. FIG. 30(b) is an enlarged view of part of FIG. 30(a).
FIG. 31(a) is a diagram showing the image obtained by Comparative Example 3. FIG. 31(b) is an enlarged view of part of FIG. 31(a).
FIG. 32(a) is a diagram showing an original image. FIG. 32(b) is a diagram showing the target image A.
FIG. 33(a) is a diagram showing the output image D obtained by the example. FIG. 33(b) is a diagram showing the image obtained by Comparative Example 1. FIG. 33(c) is a diagram showing the image obtained by Comparative Example 3.

Hereinafter, embodiments for carrying out the present invention will be described in detail with reference to the accompanying drawings. In the description of the drawings, the same elements are denoted by the same reference numerals, and overlapping descriptions are omitted. The present invention is not limited to these examples; it is defined by the claims and is intended to include all modifications within the meaning and scope equivalent to the claims.

FIG. 1 is a diagram showing the configuration of the image processing device 1 of this embodiment. The image processing device 1 reduces the noise superimposed on a target image A to create an output image D, and includes a first processing unit 10 and a second processing unit 20. For each of a plurality (N) of random noise images B_1 to B_N, the first processing unit 10 repeatedly trains a convolutional neural network (CNN) using the random noise image B_n as an input image and the target image A as a teacher image, and acquires the image output from the CNN after the repeated training as an intermediate image C_n. The second processing unit 20 creates, based on the plurality (N) of intermediate images C_1 to C_N acquired by the first processing unit 10, an output image D in which the noise of the target image A is reduced.

The N random noise images B_1 to B_N and the N intermediate images C_1 to C_N are the same size as the target image A. The random noise images B_1 to B_N need not have any meaningful structure and may consist only of noise. N is preferably an integer of 16 or more, and more preferably an integer of 50 or more. n is each integer from 1 to N. The larger the value of N, the greater the noise reduction effect is expected to be; on the other hand, the processing time becomes longer and the noise reduction effect saturates, so the upper limit of N is preferably about 200, for example. The CNN used in the first processing unit 10 is preferably of a type that does not have the characteristics of an encoder-decoder architecture (a non-encoder-decoder type); for example, a ResNet-type CNN is suitable.
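As an illustration of the random noise images B_1 to B_N described above, the following minimal sketch generates N noise-only images of a given size. It assumes images are nested lists of floats and that each pixel is drawn uniformly from [0, 1); the source does not state the noise distribution, and the function name `make_noise_images` is hypothetical.

```python
import random


def make_noise_images(height, width, n_images=16, seed=0):
    """Create n_images noise-only images, each height x width.

    Assumption: pixels are uniform in [0, 1); the patent text only says
    the images need not contain any meaningful structure.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    return [
        [[rng.random() for _ in range(width)] for _ in range(height)]
        for _ in range(n_images)
    ]
```

Plain lists keep the sketch dependency-free; an array library would be the more usual choice for real image sizes.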

The image processing device 1 can be configured to include a computer. The image processing device 1 includes an arithmetic unit including a CPU, a GPU, or the like that performs various kinds of processing; a display unit including a liquid crystal display or the like that displays images and the like; an input unit including a keyboard, a mouse, or the like that receives input of processing conditions and the like; and a storage unit including a hard disk drive, RAM, ROM, and the like that stores image data and the like.

The storage unit of the image processing device 1 also stores an image processing program for causing a computer to execute the first processing step S1 and the second processing step S2 of the image processing method of this embodiment, described below. The arithmetic unit executes the first processing step S1 and the second processing step S2 based on the image processing program. The image processing program may be stored in the storage unit when the image processing device 1 is shipped, may be acquired via a communication line after shipment and then stored in the storage unit, or may be read from a computer-readable recording medium 30 and then stored in the storage unit. The recording medium 30 may be of any type, such as a flexible disk, CD-ROM, DVD-ROM, BD-ROM, or USB memory.

FIG. 2 is a flowchart of the image processing method of this embodiment. This image processing method reduces the noise superimposed on the target image A to create an output image D, and includes a first processing step S1 and a second processing step S2. The first processing step S1 is performed by the first processing unit 10, and the second processing step S2 is performed by the second processing unit 20. In the first processing step S1, for each of a plurality (N) of random noise images B_1 to B_N, the CNN is repeatedly trained using the random noise image B_n as an input image and the target image A as a teacher image, and the image output from the CNN after the repeated training is acquired as an intermediate image C_n. In the second processing step S2, an output image D in which the noise of the target image A is reduced is created based on the plurality (N) of intermediate images C_1 to C_N acquired in the first processing step S1.

FIG. 3 is a diagram showing the configuration of the first processing unit 10. The first processing unit 10 creates the intermediate image C_n by training the CNN based on the teacher image (target image A) and the input image (random noise image B_n), and includes a first CNN unit 11 and a first evaluation unit 12.

The first CNN unit 11 inputs the input image (random noise image B_n) to the CNN and causes the CNN to output an image corresponding to its training state at the time of input. The first evaluation unit 12 obtains a loss function value representing the difference between the image output from the CNN and the teacher image (target image A). The loss function is, for example, the L2 norm. The first CNN unit 11 trains the CNN so that this loss function value becomes smaller. In this way, the first processing unit 10 repeatedly trains the CNN by means of the first CNN unit 11 and the first evaluation unit 12, using the combination of the teacher image (target image A) and the input image (random noise image B_n).
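The loss evaluation performed by the first evaluation unit 12 can be sketched as follows. This is a minimal illustration, assuming images are same-size nested lists of floats and that the L2 norm is taken as the square root of the sum of squared pixel differences; the source does not specify any normalization, and the function name `l2_loss` is hypothetical.

```python
import math


def l2_loss(output, teacher):
    """L2 norm of the pixel-wise difference between two same-size images."""
    if len(output) != len(teacher) or any(
        len(row_o) != len(row_t) for row_o, row_t in zip(output, teacher)
    ):
        raise ValueError("images must have the same size")
    return math.sqrt(
        sum(
            (o - t) ** 2
            for row_o, row_t in zip(output, teacher)
            for o, t in zip(row_o, row_t)
        )
    )
```

For example, `l2_loss([[0, 0], [0, 0]], [[3, 0], [0, 4]])` evaluates to 5.0.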

In the course of the repeated training of the CNN, the image output from the CNN is initially close to the input image (random noise image B_n). It then gradually approaches a version of the teacher image (target image A) in which the original fixed noise is replaced by the input image (random noise image B_n), and eventually becomes close to the teacher image (target image A) itself. This follows from the property of CNNs that meaningful structures are learned faster than random noise (that is, random noise is difficult to learn).

Therefore, by ending the repeated training of the CNN partway through, the CNN can be made to output a version of the teacher image (target image A) in which the original fixed noise has been replaced by the input image (random noise image B_n). The image output from the CNN at the end of training is acquired as the intermediate image C_n. The repeated training of the CNN may be ended when the number of training iterations reaches a target value, but it is preferably ended when the loss function value representing the difference between the image output from the CNN and the teacher image (target image A) falls within a target range. The target range of the loss function value used for this end-of-training decision is preferably determined based on the loss function value representing the difference between the initial input image (random noise image B_n) and the teacher image (target image A).
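The end-of-training decision described above can be sketched as a loop that stops once the loss falls to a fraction of the initial loss, with an iteration cap as a fallback. This is only a sketch under stated assumptions: `train_step` is a hypothetical callable that runs one training iteration and returns the current loss and output image, and `upper_frac=0.2` is an arbitrary placeholder, since the source does not say how the target range is derived from the initial loss.

```python
def train_until_target(train_step, initial_loss, upper_frac=0.2, max_iters=10_000):
    """Train until the loss enters the target range or max_iters is reached.

    train_step   -- hypothetical callable: runs one iteration, returns (loss, output)
    initial_loss -- loss between the input noise image B_n and the target image A
    upper_frac   -- assumed fraction of initial_loss defining the target range
    Returns (intermediate_image, iterations_used, final_loss).
    """
    if max_iters < 1:
        raise ValueError("max_iters must be at least 1")
    upper = upper_frac * initial_loss
    for iteration in range(1, max_iters + 1):
        loss, output = train_step()
        if loss <= upper:  # loss is inside the target range: stop early
            break
    return output, iteration, loss
```

Stopping on the loss rather than on a fixed count lets each of the N runs end at a comparable point even if they converge at different speeds.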

The same processing is performed using each of the N random noise images B_1 to B_N as the input image, yielding N intermediate images C_1 to C_N. Looking at any given pixel common to the N intermediate images C_1 to C_N acquired in this way, its luminance values have an approximately symmetric distribution (normal distribution) centered on the true luminance value of the same pixel of the target image A (the luminance value excluding the fixed noise). This luminance distribution reflects the noise distribution of the N random noise images B_1 to B_N.

The first processing unit 10 may perform in parallel any two or more of the processes of acquiring the N intermediate images C_1 to C_N from the N random noise images B_1 to B_N. For example, if the first processing unit 10 is configured to include N partial processing units capable of operating in parallel, and the n-th partial processing unit creates the n-th intermediate image C_n from the target image A and the n-th random noise image B_n using a CNN, the processing time can be reduced to 1/N.
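The parallel arrangement described above can be sketched with the Python standard library. In this sketch `make_intermediate` is a hypothetical callable standing in for the per-image CNN training, and a thread pool is used only to keep the example self-contained; actual CNN training would more likely be distributed over separate processes or GPUs, and the 1/N reduction presupposes N workers.

```python
from concurrent.futures import ThreadPoolExecutor


def make_intermediates(make_intermediate, noise_images, max_workers=4):
    """Apply make_intermediate to every noise image concurrently.

    make_intermediate -- hypothetical callable mapping B_n to C_n
    Results are returned in the same order as noise_images.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(make_intermediate, noise_images))
```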

FIG. 4 is a diagram showing the configuration of the second processing unit 20. The second processing unit 20 creates the output image D by training a CNN with the Noise2Noise method (Non-Patent Document 4) using the N intermediate images C_1 to C_N, and includes a second CNN unit 21, a second evaluation unit 22, and an image selection unit 23.

The image selection unit 23 selects one intermediate image C_n1 from among the N intermediate images C_1 to C_N as the input image of the CNN, and selects another intermediate image C_n2 as the teacher image of the CNN. There are N(N-1) possible combinations of input image (intermediate image C_n1) and teacher image (intermediate image C_n2), but not all N(N-1) combinations need to be used.
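The pair selection performed by the image selection unit 23 can be sketched as follows. The sketch enumerates the N(N-1) ordered (input, teacher) index pairs and optionally draws a random subset; the subset size, the seeding, and the function name `select_pairs` are assumptions for illustration, since the source does not say how the subset is chosen.

```python
import itertools
import random


def select_pairs(n_images, n_pairs=None, seed=0):
    """Return (input_index, teacher_index) pairs with distinct indices.

    With n_pairs=None all N(N-1) ordered pairs are returned; otherwise a
    random subset of n_pairs distinct pairs is drawn (assumed policy).
    """
    pairs = list(itertools.permutations(range(n_images), 2))
    if n_pairs is None or n_pairs >= len(pairs):
        return pairs
    return random.Random(seed).sample(pairs, n_pairs)
```

For N = 100 the full set has 9,900 pairs, which is why using only a subset can be attractive.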

The second CNN unit 21 inputs the input image (intermediate image C_n1) to the CNN and causes the CNN to output an image corresponding to its training state at the time of input. The second evaluation unit 22 obtains a loss function value representing the difference between the image output from the CNN and the teacher image (intermediate image C_n2). The loss function is, for example, the L2 norm. The second CNN unit 21 trains the CNN so that this loss function value becomes smaller. In this way, the second processing unit 20 repeatedly trains the CNN by means of the second CNN unit 21 and the second evaluation unit 22, using various combinations of teacher image (intermediate image C_n2) and input image (intermediate image C_n1) selected by the image selection unit 23.

By repeatedly training the CNN using various combinations of teacher image (intermediate image C_n2) and input image (intermediate image C_n1) that have mutually different noise patterns, the image output from the CNN becomes the output image D, in which the noise of the target image A is reduced.

The second processing unit 20 preferably creates the output image D by training a CNN with the Noise2Noise method using the N intermediate images C_1 to C_N, but it can also create the output image D from the N intermediate images C_1 to C_N by other methods. For example, an output image D in which the noise of the target image A is reduced can be created by averaging the N intermediate images C_1 to C_N.
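The averaging alternative mentioned above amounts to a pixel-wise mean over the N intermediate images. A minimal sketch, assuming the images are same-size nested lists of numbers (the function name `average_images` is hypothetical):

```python
def average_images(images):
    """Pixel-wise mean of a non-empty list of same-size images."""
    if not images:
        raise ValueError("at least one image is required")
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(img[y][x] for img in images) / n for x in range(width)]
        for y in range(height)
    ]
```

For example, `average_images([[[0, 2]], [[4, 6]]])` gives `[[2.0, 4.0]]`.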

According to this embodiment, the noise of the target image A can be effectively reduced even when there is only one target image A or when the SN ratio of the target image A is low. The noise reduction effect of this embodiment is described below using simulation results.

The simulation results are described next. In the following, the output image D was created in the second processing step by training a CNN with the Noise2Noise method.

FIG. 5(a) shows the original image used in the simulation. FIG. 5(b) shows the target image A used in the simulation. The target image A (FIG. 5(b)) was created by superimposing fixed noise having a normal distribution with a standard deviation of 5 on the original image (FIG. 5(a)). FIG. 6(a) shows an example of a random noise image B_n used in the simulation. FIG. 6(b) shows an example of an intermediate image C_n created in the simulation. In the first processing step, with N = 100, intermediate images C_1 to C_100 were created from the random noise images B_1 to B_100 and the target image A.

FIG. 7 is a histogram showing the distribution of luminance values in the difference image between two intermediate images selected at random from the intermediate images C_1 to C_100. The horizontal axis is the luminance value and the vertical axis is the frequency. As this figure shows, the luminance values of the pixels of the intermediate images C_n have a substantially symmetric (normal) distribution. The standard deviation of this distribution was 14.256.
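One way to read the figure of 14.256 is sketched below. It assumes that the two intermediate images carry independent noise of equal standard deviation, so that the difference image has a standard deviation larger by a factor of the square root of 2. The source does not state this assumption.

```python
import math

# Standard deviation of the difference image reported for FIG. 7.
diff_std = 14.256

# Under the independence assumption, Var(a - b) = Var(a) + Var(b) = 2 * sigma^2.
per_image_std = diff_std / math.sqrt(2.0)
print(f"implied per-image random-noise std: {per_image_std:.3f}")
```

Any noise component common to both intermediate images cancels in the difference, so this estimate reflects only the part of the noise that varies from one intermediate image to the next.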

For a common region of each image (the region above the hat in the image), the standard deviation of the distribution of noise contained in the pixel luminance values was 5.894 for the target image A, 11.137 for one of the intermediate images C_1 to C_100, 5.675 for the average image of the intermediate images C_1 to C_100, and 2.893 for the output image D obtained by training the CNN with the Noise2Noise method. Compared with the target image A, the standard deviation of the noise distribution is larger for each intermediate image C_n, smaller for the average image of the intermediate images C_1 to C_100, and smaller still for the output image D.
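The comparison above can be restated as ratios relative to the target image A. The short calculation below uses only the four reported values. The dictionary keys are labels chosen here for readability.

```python
# Standard deviations reported for the common region (above the hat).
reported_std = {
    "target image A": 5.894,
    "one intermediate image": 11.137,
    "average of C_1 to C_100": 5.675,
    "output image D (Noise2Noise)": 2.893,
}

baseline = reported_std["target image A"]
ratios = {name: value / baseline for name, value in reported_std.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f} x the target-image noise")
```

On these figures, the Noise2Noise output has roughly half the noise standard deviation of the target image, whereas simple averaging reduces it by only a few percent.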

FIGS. 8 to 11 are graphs showing the relationship between the number of CNN training iterations in the first processing step and the loss function value representing the difference between the image output from the CNN and the teacher image. FIG. 9 is an enlarged view of part of FIG. 8. FIG. 11 is an enlarged view of part of FIG. 10. The horizontal axis is the number of CNN training iterations and the vertical axis is the loss function value.

FIGS. 8 and 9 show the case where the teacher image is the random noise pattern image (fixed noise image) that was added to the original image (FIG. 5(a)) when preparing the target image A (FIG. 5(b)). As these figures show, the loss function value tends to decrease as the number of CNN training iterations in the first processing step increases. As training is repeated, the fixed noise image is learned from the random noise image and the loss function value decreases. The image output from the CNN eventually becomes close to the fixed noise image. In the first processing step, training is terminated before the fixed noise image has been learned, at a point where the image output from the CNN still contains the random noise image. In the example of FIGS. 8 and 9, it is preferable to stop after 60 to 80 training iterations. As described above, the iterative training of the CNN may be terminated when the number of training iterations reaches a target value, but it is preferable to terminate it when the loss function value representing the difference between the image output from the CNN and the teacher image falls within a target range.
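The two termination criteria described above (an iteration cap and a target loss range) can be sketched as a generic loop. This is a minimal illustration, not the patent's implementation: `step` is a hypothetical callable standing in for one CNN update, and the mock loss curve and thresholds are arbitrary.

```python
from typing import Callable, List, Tuple


def train_with_early_stop(
    step: Callable[[], float],
    max_iterations: int,
    loss_range: Tuple[float, float],
) -> List[float]:
    """Run training steps until the loss enters loss_range or the cap is hit.

    `step` is a hypothetical callable that performs one CNN update and
    returns the resulting loss function value.
    """
    low, high = loss_range
    history: List[float] = []
    for _ in range(max_iterations):
        loss = step()
        history.append(loss)
        if low <= loss <= high:  # loss inside the target range: stop here
            break
    return history


def make_mock_step(initial: float = 100.0, decay: float = 0.95) -> Callable[[], float]:
    """Stand-in for a real training step: the loss shrinks geometrically."""
    state = {"loss": initial}

    def step() -> float:
        state["loss"] *= decay
        return state["loss"]

    return step


history = train_with_early_stop(make_mock_step(), max_iterations=200,
                                loss_range=(0.0, 5.0))
print(f"stopped after {len(history)} iterations, loss {history[-1]:.3f}")
```

In the first processing step, the image output by the CNN at the stopping iteration would be kept as the intermediate image. Stopping on a loss range rather than a fixed count lets the stopping point adapt to how quickly a given random noise image is fitted.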

FIGS. 10 and 11 show the case where the original image (FIG. 5(a)) is used as the teacher image (SignalOnly) and the case where the target image A on which fixed noise is superimposed (FIG. 5(b)) is used as the teacher image (Signal+Noise). As these figures show, training is slower when the target image A with superimposed fixed noise is the teacher image than when the original image is the teacher image. This indicates that, as with the Deep Image Prior technique (Non-Patent Documents 1 to 3), noise superimposed on the teacher image slows training. Therefore, by terminating the CNN training after an appropriate number of iterations, the image output from the CNN can be made to be the original image with random noise superimposed on it, without the fixed noise having been learned.

FIGS. 12 and 13 show examples of images output from the CNN during the iterative training of the CNN in the first processing step. Here, the original image (FIG. 5(a)) was used as the teacher image. FIG. 12(a) shows the image output from the CNN after 32 training iterations in the first processing step. FIG. 12(b) shows the image output after 64 training iterations. FIG. 13(a) shows the image output after 96 training iterations. FIG. 13(b) shows the image output after 128 training iterations. As these figures show, the image output from the CNN approaches the teacher image as the number of CNN training iterations in the first processing step increases.

Next, using a target image A created by superimposing on the original image (FIG. 5(a)) fixed noise having a Gaussian distribution with a larger standard deviation of 50, the degree of noise reduction was compared by simulation between the method of this embodiment (Example) and another method (Comparative Example). FIG. 14 shows the target image A. The fixed noise was added to the original image after the data had been extended to 32-bit single-precision floating point, so that values at or below 0 or at or above 255 were not clipped. In this figure, two regions of interest R1 and R2 are indicated by rectangular frames. FIG. 15(a) shows the output image D obtained in the Example. FIG. 15(b) shows the image obtained in Comparative Example 1. Comparative Example 1 is based on the Deep Image Prior technique.
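The reason for adding the noise in floating point can be illustrated with a small sketch. It compares unclipped noise with noise clipped to the 8-bit range for a hypothetical dark region. Python floats are 64-bit rather than the 32-bit single precision mentioned above, so the sketch shows only the effect of avoiding clipping. The function and variable names are illustrative.

```python
import random


def add_fixed_noise(pixels, sigma, rng, clip=False):
    """Add zero-mean Gaussian noise to 8-bit pixel values using float arithmetic.

    With clip=False the result may leave the 0..255 range, which keeps the
    noise distribution intact; clip=True mimics what 8-bit storage would do.
    """
    noisy = [float(p) + rng.gauss(0.0, sigma) for p in pixels]
    if clip:
        noisy = [min(255.0, max(0.0, v)) for v in noisy]
    return noisy


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


dark = [5] * 10000  # hypothetical dark region with pixel value 5

# The same seed is used twice so both runs draw identical noise samples.
unclipped = add_fixed_noise(dark, 50.0, random.Random(1))
clipped = add_fixed_noise(dark, 50.0, random.Random(1), clip=True)

print(f"mean without clipping: {mean(unclipped):.2f}")
print(f"mean with clipping:    {mean(clipped):.2f}")
```

Clipping at 0 shifts the mean of a dark region upward and truncates the noise distribution, which would distort the standard deviations used in the comparison.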

As these figures show, the image obtained in Comparative Example 1 looks sharp at first glance, but its noise appears to contain lumpy low-frequency components. In contrast, the output image D obtained in the Example appears to retain some noise overall, but the noise is reduced in a well-balanced way and the original image is reproduced more naturally.

FIG. 16 shows enlarged images of the difference between each image and the target image A in the region of interest R1. FIG. 16(a) shows the distribution of the fixed noise added to the original image. The standard deviation of the noise distribution of this image was 48.6. FIG. 16(b) shows the distribution of the values obtained by subtracting the target image A from the output image D obtained in the Example. The standard deviation of the noise distribution of this image was 16.2. FIG. 16(c) shows the distribution of the values obtained by subtracting the target image A from the image obtained in Comparative Example 1. The standard deviation of the noise distribution of this image was 16.3.

FIG. 17 shows enlarged images of the difference between each image and the target image A in the region of interest R2. FIG. 17(a) shows the distribution of the fixed noise added to the original image. The standard deviation of the noise distribution of this image was 50.0. FIG. 17(b) shows the distribution of the values obtained by subtracting the target image A from the output image D obtained in the Example. The standard deviation of the noise distribution of this image was 13.8. FIG. 17(c) shows the distribution of the values obtained by subtracting the target image A from the image obtained in Comparative Example 1. The standard deviation of the noise distribution of this image was 16.9.

Ideally, the difference between the output image and the target image A equals the fixed noise added to the original image. However, as FIGS. 16 and 17 show, in Comparative Example 1 noise with lower-frequency components than the fixed noise added to the original image is conspicuous. In the Example, in contrast, although the standard deviation is smaller, the difference between the output image and the target image A has the same characteristics as the fixed noise.
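The measurement reported for FIGS. 16 and 17 (the standard deviation of a difference image inside a region of interest) can be sketched as follows. The ROI layout, the use of the population standard deviation, and the tiny test images are assumptions made for illustration. The source does not specify how the statistic was computed.

```python
import statistics


def roi_difference_std(image, reference, roi):
    """Standard deviation of (image - reference) inside a rectangular ROI.

    Images are lists of rows; roi = (top, left, height, width) in pixels.
    """
    top, left, height, width = roi
    diffs = [
        image[r][c] - reference[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    ]
    return statistics.pstdev(diffs)


# Tiny hypothetical 3x4 images, used only to exercise the function.
target = [[10, 12, 14, 16],
          [20, 22, 24, 26],
          [30, 32, 34, 36]]
output = [[11, 11, 15, 15],
          [21, 21, 25, 25],
          [31, 31, 35, 35]]

result = roi_difference_std(output, target, roi=(0, 0, 2, 4))
print(result)
```

A standard deviation alone does not capture the spatial character of the residual, which is why the text also compares the frequency content of the difference images visually.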

FIG. 18 shows enlarged views of each image in the region of interest R1. FIG. 18(a) shows the target image A. The standard deviation of the noise distribution of this image was 49.8. FIG. 18(b) shows the output image D obtained in the Example. The standard deviation of the noise distribution of this image was 18.9. FIG. 18(c) shows the image obtained in Comparative Example 1. The standard deviation of the noise distribution of this image was 19.3.

FIG. 19 shows enlarged views of each image in the region of interest R2. FIG. 19(a) shows the target image A. The standard deviation of the noise distribution of this image was 50.2. FIG. 19(b) shows the output image D obtained in the Example. The standard deviation of the noise distribution of this image was 14.6. FIG. 19(c) shows the image obtained in Comparative Example 1. The standard deviation of the noise distribution of this image was 17.9.

As FIGS. 18 and 19 show, in Comparative Example 1, compared with the target image A, low-frequency noise that is present in neither the original image nor the fixed noise image is conspicuous. In the Example, in contrast, the noise appears to be reduced while keeping a pattern similar to that of the target image A.

Next, the degree of noise reduction was compared by simulation between the Example and Comparative Examples 1 to 3. Comparative Example 1 is based on the Deep Image Prior technique. Comparative Example 2 is based on Non-Local Means (template size 3×3). Comparative Example 3 is based on Non-Local Means (template size 5×5). The original image used here is the one shown in FIG. 5(a), and the target image A is the one shown in FIG. 5(b). FIG. 20 shows the original image, with two regions of interest R3 and R4 indicated by rectangular frames.

FIGS. 21 to 23 show enlarged views of each image in the region of interest R3. FIG. 21(a) shows the original image. FIG. 21(b) shows the target image A. FIG. 22(a) shows the output image D obtained in the Example. FIG. 22(b) shows the image obtained in Comparative Example 1. FIG. 23(a) shows the image obtained in Comparative Example 2. FIG. 23(b) shows the image obtained in Comparative Example 3. As FIGS. 21 to 23 show, among the Example and Comparative Examples 1 to 3, the output image obtained in the Example is the closest to the original image and reproduces the thin, stray strands of hair well. In the images obtained in Comparative Examples 1 to 3, the thin strands of hair are broken up and straight strands are not reproduced.

FIGS. 24 to 26 show enlarged views of each image in the region of interest R4. FIG. 24(a) shows the original image. FIG. 24(b) shows the target image A. FIG. 25(a) shows the output image D obtained in the Example. FIG. 25(b) shows the image obtained in Comparative Example 1. FIG. 26(a) shows the image obtained in Comparative Example 2. FIG. 26(b) shows the image obtained in Comparative Example 3. As FIGS. 24 to 26 show, among the Example and Comparative Examples 1 to 3, the output image obtained in the Example is the most faithful to the original image: it reproduces the vertical stripes that are presumably the cloth texture of the hat ribbon, and it also reproduces the horizontal stripes of the cloth. In the image obtained in Comparative Example 1, the vertical stripes have almost disappeared and the horizontal stripes of the cloth are poorly reproduced.

Next, the degree of noise reduction was compared by simulation between the Example and Comparative Examples 1 and 3 using another target image. Comparative Example 1 is based on the Deep Image Prior technique. Comparative Example 3 is based on Non-Local Means (template size 5×5). The images used as the target image and the original image are fluorescence images captured with a digital CMOS camera (ORCA-Fusion) manufactured by Hamamatsu Photonics K.K.

FIG. 27 shows the original image. FIG. 27(b) is an enlarged view of part of FIG. 27(a). This figure indicates the region of interest R5 with a rectangular frame. This image was captured with a sufficiently long exposure time of 200 ms and can be treated as an original image with a good SN ratio.

FIG. 28 shows the target image A. FIG. 28(b) is an enlarged view of part of FIG. 28(a). This image was captured with a sufficiently short exposure time of 6 ms and can be treated as a target image A with a poor SN ratio.

FIG. 29 shows the output image D obtained in the Example. FIG. 29(b) is an enlarged view of part of FIG. 29(a). FIG. 30 shows the image obtained in Comparative Example 1. FIG. 30(b) is an enlarged view of part of FIG. 30(a). FIG. 31 shows the image obtained in Comparative Example 3. FIG. 31(b) is an enlarged view of part of FIG. 31(a).

FIGS. 32 and 33 show further enlarged views of each image in the region of interest R5. FIG. 32(a) shows the original image. The standard deviation of the noise distribution of this image was 6.467. FIG. 32(b) shows the target image A. The standard deviation of the noise distribution of this image was 3.979. FIG. 33(a) shows the output image D obtained in the Example. The standard deviation of the noise distribution of this image was 0.029. FIG. 33(b) shows the image obtained in Comparative Example 1. The standard deviation of the noise distribution of this image was 0.214. FIG. 33(c) shows the image obtained in Comparative Example 3. The standard deviation of the noise distribution of this image was 0.901.

As FIGS. 27 to 33 show, among the Example and Comparative Examples 1 and 3, the output image D obtained in the Example had the smallest standard deviation of the noise distribution and was the closest to the original image.

Reference signs list: 1: image processing device; 10: first processing unit; 11: first CNN unit; 12: first evaluation unit; 20: second processing unit; 21: second CNN unit; 22: second evaluation unit; 23: image selection unit; 30: recording medium.

Claims

1. An image processing device comprising:
a first processing unit that, for each of a plurality of random noise images, repeatedly trains a convolutional neural network using the random noise image as an input image and a target image as a teacher image, and acquires, as an intermediate image, the image output from the convolutional neural network after the iterative training; and
a second processing unit that creates, based on the plurality of intermediate images acquired by the first processing unit, an output image in which noise is reduced relative to the target image.

2. The image processing device according to claim 1,
wherein the first processing unit terminates the iterative training when, in the course of the iterative training, a loss function value representing the difference between the image output from the convolutional neural network and the target image falls within a target range, and acquires, as the intermediate image, the image output from the convolutional neural network at the time of termination.

3. The image processing device according to claim 1 or 2,
wherein the first processing unit uses, as the convolutional neural network, a network that is not of the encoder-decoder type.

4. The image processing device according to any one of claims 1 to 3,
wherein the first processing unit performs in parallel any two or more of the processes of acquiring the intermediate image for the respective random noise images.

5. The image processing device according to any one of claims 1 to 4,
wherein the second processing unit creates the output image by training a convolutional neural network with a Noise2Noise method, using one of the plurality of intermediate images as an input image and another one as a teacher image.

6. The image processing device according to any one of claims 1 to 4,
wherein the second processing unit creates the output image by averaging the plurality of intermediate images.

7. An image processing method comprising:
a first processing step of, for each of a plurality of random noise images, repeatedly training a convolutional neural network using the random noise image as an input image and a target image as a teacher image, and acquiring, as an intermediate image, the image output from the convolutional neural network after the iterative training; and
a second processing step of creating, based on the plurality of intermediate images acquired in the first processing step, an output image in which noise is reduced relative to the target image.

8. The image processing method according to claim 7,
wherein, in the first processing step, the iterative training is terminated when, in the course of the iterative training, a loss function value representing the difference between the image output from the convolutional neural network and the target image falls within a target range, and the image output from the convolutional neural network at the time of termination is acquired as the intermediate image.

9. The image processing method according to claim 7 or 8,
wherein, in the first processing step, a network that is not of the encoder-decoder type is used as the convolutional neural network.

10. The image processing method according to any one of claims 7 to 9,
wherein, in the first processing step, any two or more of the processes of acquiring the intermediate image for the respective random noise images are performed in parallel.

11. The image processing method according to any one of claims 7 to 10,
wherein, in the second processing step, the output image is created by training a convolutional neural network with a Noise2Noise method, using one of the plurality of intermediate images as an input image and another one as a teacher image.

12. The image processing method according to any one of claims 7 to 10,
wherein, in the second processing step, the output image is created by averaging the plurality of intermediate images.

13. An image processing program for causing a computer to execute the first processing step and the second processing step of the image processing method according to any one of claims 7 to 12.

14. A computer-readable recording medium on which the image processing program according to claim 13 is recorded.