JP7301467B2

JP7301467B2 - Image Interpretation Support System and Image Interpretation Support Program

Info

Publication number: JP7301467B2
Application number: JP2019076504A
Authority: JP
Inventors: 充芦澤
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2019-04-12
Filing date: 2019-04-12
Publication date: 2023-07-03
Anticipated expiration: 2039-04-12
Also published as: JP2020173720A

Description

TECHNICAL FIELD
The present invention relates to a technique for identifying the type of an article appearing in an image.

BACKGROUND ART
In the interpretation of remote sensing images and the like, a system such as the one disclosed in Patent Document 1 is used to save labor and to eliminate differences in skill between individuals.
This system outputs the position and type of each object in an input image.
Here, an object is an article whose position or model name the user wants to identify.

A neural network such as the one described in Non-Patent Document 1 is an effective means of realizing such a system.
A neural network can improve the accuracy of its output through supervised learning. However, a sufficient quantity of training data must be secured.
If a sufficient quantity of training data cannot be secured by actual measurement, the training data can be supplemented by the method disclosed in Non-Patent Document 2. This method generates simulated images by simulation to supplement the training data.

Patent Document 1: JP 2010-271845 A

Non-Patent Document 1: "ImageNet Classification with Deep Convolutional Neural Networks", Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, Advances in Neural Information Processing Systems 25 (NIPS), 2012, pp. 1106-1114
Non-Patent Document 2: "Learning from Simulated and Unsupervised Images through Adversarial Training", Ashish Shrivastava, Tomas Pfister, Oncel Tuzel, Josh Susskind, Wenda Wang, Russ Webb, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2107-2116

To correct errors in the output of a neural network, or to guarantee that the output is correct, the user checks the validity of the output of the neural network.
In this validity check, the user calls to mind an image of the object based on the identification result output from the neural network. The identification result indicates the type of the object in the image.
The appearance of an object differs depending on the direction from which it is observed. The user must therefore call the image to mind repeatedly while mentally changing the observation direction. This repetition makes the cognitive load of the validity check high.
Because the load of the validity check is high, it is difficult to perform the check many times with high accuracy.

SUMMARY OF THE INVENTION
An object of the present invention is to reduce the workload of checking the validity of the output of a neural network in image interpretation.

The image interpretation support system of the present invention includes:
an estimation unit that uses a neural network to estimate the type of an object appearing in a real image and the imaging direction with respect to the object;
an estimated image generation unit that performs an imaging simulation based on the estimated type and the estimated direction, thereby generating an estimated image obtained by imaging an article of the same type as the object from the same direction as the estimated direction; and
a display unit that displays the real image and the estimated image on a display.

According to the present invention, an estimated image based on the output of the neural network (the estimated type and the estimated direction) is displayed. This reduces the workload of checking the validity of the output of the neural network in image interpretation.

FIG. 1 is a configuration diagram of the image interpretation support system 100 according to Embodiment 1.
FIG. 2 is a configuration diagram of the image interpretation support device 200 according to Embodiment 1.
FIG. 3 is a flowchart of the image interpretation support method according to Embodiment 1.
FIG. 4 is a flowchart of the learning process (S110) according to Embodiment 1.
FIG. 5 is a flowchart of the estimation process (S120) according to Embodiment 1.
FIG. 6 is a flowchart of the support process (S130) according to Embodiment 1.

In the embodiment and the drawings, the same or corresponding elements are denoted by the same reference signs. Descriptions of elements bearing the same reference signs as elements already described are omitted or simplified as appropriate. Arrows in the drawings mainly indicate flows of data or flows of processing.

Embodiment 1.
The image interpretation support system 100 will be described with reference to FIGS. 1 to 6.

*** Description of configuration ***
The configuration of the image interpretation support system 100 will be described with reference to FIG. 1.
The image interpretation support system 100 is a system that interprets images using a neural network, and it reduces the user's load in the task of checking the validity of interpretation results.

The image interpretation support system 100 includes a display 101 and an image interpretation support device 200.
The image interpretation support device 200 displays, on the display 101, a screen that supports checking the validity of the interpretation result. The displayed screen is described later.

The configuration of the image interpretation support device 200 will be described with reference to FIG. 2.
The image interpretation support device 200 is a computer that includes hardware such as a processor 201, a memory 202, an auxiliary storage device 203, a communication device 204, and an input/output interface 205. These pieces of hardware are connected to one another via signal lines.

The processor 201 is an IC that performs arithmetic processing and controls the other hardware. For example, the processor 201 is a CPU, a DSP, or a GPU.
IC is an abbreviation for Integrated Circuit.
CPU is an abbreviation for Central Processing Unit.
DSP is an abbreviation for Digital Signal Processor.
GPU is an abbreviation for Graphics Processing Unit.

The memory 202 is a volatile storage device. The memory 202 is also called a main storage device or main memory. For example, the memory 202 is a RAM. Data stored in the memory 202 is saved to the auxiliary storage device 203 as needed.
RAM is an abbreviation for Random Access Memory.

The auxiliary storage device 203 is a non-volatile storage device. For example, the auxiliary storage device 203 is a ROM, an HDD, or a flash memory. Data stored in the auxiliary storage device 203 is loaded into the memory 202 as needed.
ROM is an abbreviation for Read Only Memory.
HDD is an abbreviation for Hard Disk Drive.

The communication device 204 is a receiver and a transmitter. For example, the communication device 204 is a communication chip or a NIC.
NIC is an abbreviation for Network Interface Card.

The input/output interface 205 is a port to which input devices and output devices are connected. For example, the input/output interface 205 is a USB terminal, the input devices are a keyboard and a mouse, and the output device is the display 101.
USB is an abbreviation for Universal Serial Bus.

The image interpretation support device 200 includes elements such as a learning unit 210, an estimation unit 220, and a support unit 230. These elements are implemented in software.
The learning unit 210 includes a simulated image generation unit 211 and a parameter adjustment unit 212.
The support unit 230 includes an estimated image generation unit 231 and a display unit 232.

The auxiliary storage device 203 stores an image interpretation support program that causes a computer to function as the learning unit 210, the estimation unit 220, and the support unit 230. The image interpretation support program is loaded into the memory 202 and executed by the processor 201.
The auxiliary storage device 203 also stores an OS. At least part of the OS is loaded into the memory 202 and executed by the processor 201.
The processor 201 executes the image interpretation support program while executing the OS.
OS is an abbreviation for Operating System.

Input and output data of the image interpretation support program is stored in a storage unit 290. For example, the storage unit 290 stores a real image 291, a three-dimensional model file 292, and the like.
The memory 202 functions as the storage unit 290. However, storage devices such as the auxiliary storage device 203, registers in the processor 201, and cache memory in the processor 201 may function as the storage unit 290 instead of, or together with, the memory 202.

The image interpretation support device 200 may include a plurality of processors in place of the processor 201. The plurality of processors share the role of the processor 201.

The image interpretation support program can be recorded (stored) in a computer-readable manner on a non-volatile recording medium such as an optical disc or a flash memory.

*** Description of operation ***
The operation of the image interpretation support device 200 corresponds to an image interpretation support method. The procedure of the image interpretation support method corresponds to the procedure of the image interpretation support program.

An outline of the image interpretation support method will be described with reference to FIG. 3.
In step S110, the learning unit 210 adjusts the internal parameters 121 of the neural network 120 by supervised learning. The neural network 120 takes an image as input and is used to estimate information that supports interpretation. The internal parameters are numerical parameters that determine the estimation result for an input to the neural network. Specifically, the internal parameters are the biases of the units that make up the neural network and the weights of the connections that make up the neural network.
In step S120, the estimation unit 220 uses the neural network 120 to estimate the type of the object appearing in the real image 291 and the imaging direction with respect to the object. The object is an article appearing in the real image 291.
In step S130, the support unit 230 displays, on the display 101, the real image 291 and an estimated image 132 based on the estimation result.
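As an illustration of how the internal parameters (connection weights and unit biases) determine the output for a given input, the following minimal sketch computes the output of a single unit. The function name `unit_output` and the choice of a sigmoid activation are assumptions made for this example; the description does not specify the activation function or the network architecture.

```python
import math


def unit_output(inputs, weights, bias):
    """Output of one unit: activation(sum of weight * input, plus bias).

    The sigmoid activation is an assumption for illustration only.
    """
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

Changing a weight or the bias changes the output for the same input, which is why adjusting these values in step S110 changes what the network estimates in step S120.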

Details of the learning process (S110) will be described with reference to FIG. 4.
In step S111, the simulated image generation unit 211 determines a type of article and generates type information 111.
The type information 111 is information indicating the determined type. Each type of article is identified by an identifier.
Specifically, the simulated image generation unit 211 determines the type of article at random.

In step S112, the simulated image generation unit 211 determines an imaging direction with respect to the article and generates direction information 112.
The imaging direction with respect to the article corresponds to the direction from which the article is observed.
The direction information 112 is information indicating the determined direction. Each imaging direction is identified by an identifier.
Specifically, the simulated image generation unit 211 determines the imaging direction at random.
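Steps S111 and S112 can be sketched as drawing one type identifier and one direction identifier at random. The function name `sample_label`, the identifier strings, and the use of a uniform distribution are assumptions for illustration; the description states only that the type and the direction are determined at random and are identified by identifiers.

```python
import random


def sample_label(type_ids, direction_ids, rng=random):
    """Pick one type identifier (S111) and one direction identifier (S112).

    Uniform sampling is an assumption; the source says only "at random".
    """
    type_info = rng.choice(type_ids)
    direction_info = rng.choice(direction_ids)
    return type_info, direction_info
```

Passing a seeded `random.Random` instance as `rng` makes the generated training set reproducible.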

In step S113, the simulated image generation unit 211 searches the three-dimensional model file 292 and acquires a three-dimensional model 113 from it.
The three-dimensional model file 292 is a file in which types of articles and three-dimensional models of the articles are associated with each other, and it is stored in the storage unit 290 in advance. In the three-dimensional model file 292, a plurality of pieces of type information and a plurality of three-dimensional models are associated with each other. A three-dimensional model is data representing the three-dimensional shape of an article. Specifically, a three-dimensional model is data consisting of the number, X coordinate, Y coordinate, and Z coordinate of each vertex obtained when the surface of the article is represented as a combination of polygons, together with the combinations of vertex numbers that make up each polygon.
The three-dimensional model 113 is the three-dimensional model associated with the same type as the type indicated by the type information 111.
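The three-dimensional model and the lookup by type described above could be represented as in the following sketch. The class name `Model3D`, the function `find_model`, and the use of an in-memory dictionary keyed by type identifier are assumptions for illustration; the description does not define the file format of the three-dimensional model file 292.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model3D:
    # Vertex number -> (X, Y, Z) coordinates.
    vertices: dict
    # Each polygon is a tuple of the vertex numbers that form it.
    polygons: tuple

    def polygon_points(self):
        """Resolve vertex numbers to coordinates for every polygon."""
        return [tuple(self.vertices[n] for n in poly) for poly in self.polygons]


def find_model(model_file, type_id):
    """Return the model associated with type_id (hypothetical dict-based file)."""
    return model_file[type_id]
```

For example, a single triangle would be stored as three numbered vertices plus one polygon `(1, 2, 3)` that refers to them by number.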

In step S114, the simulated image generation unit 211 performs an imaging simulation based on the three-dimensional model 113 and the direction indicated by the direction information 112. A simulated image 114 is thereby generated.
The simulated image 114 corresponds to an image obtained by imaging an article of the same type as the type indicated by the type information 111 from the same direction as the direction indicated by the direction information 112.
The imaging simulation is a process that generates an image by simulating imaging. Specifically, the imaging simulation uses a technique such as ray tracing to obtain the values measured at the image sensor for electromagnetic waves reflected at the surface of the article.
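A heavily simplified imaging simulation in the spirit of ray tracing can be sketched as follows. It casts one parallel (orthographic) ray per pixel along the viewing direction, tests it against each triangle with the Möller-Trumbore intersection test, and records a brightness for the nearest hit. The function names, the orthographic camera, the default image size, and the shading rule (brightness equal to the absolute cosine between surface normal and ray) are all assumptions for illustration, not the sensor model of the described system.

```python
import math


def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])
def unit(a):
    n = math.sqrt(dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)


def hit(origin, d, tri, eps=1e-9):
    """Moller-Trumbore ray/triangle test; returns hit distance or None."""
    p0, p1, p2 = tri
    e1, e2 = sub(p1, p0), sub(p2, p0)
    h = cross(d, e2)
    a = dot(e1, h)
    if abs(a) < eps:
        return None  # ray is parallel to the triangle plane
    s = sub(origin, p0)
    u = dot(s, h) / a
    if u < 0 or u > 1:
        return None
    q = cross(s, e1)
    v = dot(d, q) / a
    if v < 0 or u + v > 1:
        return None
    t = dot(e2, q) / a
    return t if t > eps else None


def simulate_imaging(triangles, view_dir, size=8, half_width=1.0, distance=10.0):
    """Render a size x size image of the triangles seen along view_dir."""
    d = unit(view_dir)
    helper = (0.0, 0.0, 1.0) if abs(d[2]) < 0.9 else (1.0, 0.0, 0.0)
    right = unit(cross(d, helper))
    up = cross(right, d)
    image = []
    for row in range(size):
        y = half_width * (1 - 2 * (row + 0.5) / size)
        line = []
        for col in range(size):
            x = half_width * (2 * (col + 0.5) / size - 1)
            origin = tuple(-distance * d[k] + x * right[k] + y * up[k] for k in range(3))
            nearest, value = None, 0.0
            for tri in triangles:
                t = hit(origin, d, tri)
                if t is not None and (nearest is None or t < nearest):
                    normal = unit(cross(sub(tri[1], tri[0]), sub(tri[2], tri[0])))
                    nearest, value = t, abs(dot(normal, d))  # assumed shading rule
            line.append(value)
        image.append(line)
    return image
```

Calling `simulate_imaging` with the same triangles and a different `view_dir` gives the appearance of the same article from another direction, which is the property the learning process relies on.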

In step S115, the parameter adjustment unit 212 performs supervised learning on the internal parameters 121 of the neural network 120.
In the supervised learning, the parameter adjustment unit 212 uses the simulated image 114 as input data and uses the type indicated by the type information 111 and the direction indicated by the direction information 112 as teacher data.
The internal parameters 121 of the neural network 120 are thereby adjusted.

Specifically, the parameter adjustment unit 212 adjusts the internal parameters 121 so that the type estimated by the neural network 120 matches the type indicated by the type information 111 and the direction estimated by the neural network 120 matches the direction indicated by the direction information 112.
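The adjustment toward matching both the type and the direction can be sketched as minimizing a joint loss over two outputs. The example below is a deliberately small stand-in: one linear layer with a softmax per output ("type" and "direction"), trained by stochastic gradient descent on the sum of the two cross-entropy losses. The model shape, the loss, the learning rate, and all function names are assumptions for illustration; the description does not specify the architecture of the neural network 120 or the training algorithm.

```python
import math


def init_params(n_inputs, head_sizes):
    """Zero-initialised weights W and biases b for each output head."""
    return {h: {"W": [[0.0] * n_inputs for _ in range(k)], "b": [0.0] * k}
            for h, k in head_sizes.items()}


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(params, x):
    """Class probabilities per head: softmax(W x + b)."""
    return {h: softmax([sum(w * xi for w, xi in zip(row, x)) + b
                        for row, b in zip(p["W"], p["b"])])
            for h, p in params.items()}


def joint_loss(probs, labels):
    """Cross-entropy for the type plus cross-entropy for the direction."""
    return -sum(math.log(probs[h][labels[h]]) for h in probs)


def sgd_step(params, x, labels, lr=0.1):
    """One gradient step; returns the loss measured before the update."""
    probs = forward(params, x)
    for h, p in params.items():
        for k in range(len(p["b"])):
            grad = probs[h][k] - (1.0 if k == labels[h] else 0.0)
            p["b"][k] -= lr * grad
            for i, xi in enumerate(x):
                p["W"][k][i] -= lr * grad * xi
    return joint_loss(probs, labels)
```

Here `x` would be the pixel values of a simulated image 114 flattened into a list, and `labels` would hold the class indices corresponding to the type information 111 and the direction information 112.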

Details of the estimation process (S120) will be described with reference to FIG. 5.
The adjusted internal parameters 121 are set in the neural network 120.

In step S121, the estimation unit 220 uses the neural network 120 to estimate the type of the object appearing in the real image 291 and the imaging direction with respect to the object. A type estimation result 123 and a direction estimation result 124 are thereby obtained.
The real image 291 is an image obtained by imaging the object, and it is stored in the storage unit 290 in advance. Alternatively, the real image 291 is input to the image interpretation support device 200 by the user.
The imaging direction with respect to the object corresponds to the direction from which the object is observed.
The type estimation result 123 is information indicating the estimated type.
The direction estimation result 124 is information indicating the estimated direction.
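One way the two estimation results could be derived from network outputs is sketched below, assuming the network emits one score per candidate type and one score per candidate direction. That output format, the function name `decode_estimates`, and the identifier strings are assumptions for illustration; the description states only that a type and a direction are estimated.

```python
def decode_estimates(type_scores, direction_scores, type_ids, direction_ids):
    """Map the highest-scoring entries to a type and a direction identifier."""
    best_type = max(range(len(type_scores)), key=type_scores.__getitem__)
    best_dir = max(range(len(direction_scores)), key=direction_scores.__getitem__)
    return type_ids[best_type], direction_ids[best_dir]
```

The returned pair plays the role of the type estimation result 123 and the direction estimation result 124.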

Details of the support process (S130) will be described with reference to FIG. 6.
In step S131, the estimated image generation unit 231 searches the three-dimensional model file 292 and acquires a three-dimensional model 131 from it.
The three-dimensional model 131 is the three-dimensional model associated with the same type as the type indicated by the type estimation result 123.

In step S132, the estimated image generation unit 231 performs an imaging simulation based on the three-dimensional model 131 and the direction indicated by the direction estimation result 124. The estimated image 132 is thereby generated.
The imaging simulation is the same process as the one performed in step S114.
The estimated image 132 corresponds to an image obtained by imaging an article of the same type as the type indicated by the type estimation result 123 from the same direction as the direction indicated by the direction estimation result 124.

The estimated image 132 is generated based on the output of the neural network 120 (the type estimation result 123 and the direction estimation result 124). The estimated image 132 can therefore be regarded as the output of the neural network 120 expressed in the form of an image.
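Steps S131 and S132 can be sketched as a lookup followed by a call to the same imaging simulation used during learning. In this sketch the model file is a dictionary keyed by type identifier, `direction_table` maps a direction identifier to a viewing vector, and `simulate` is whatever imaging simulation function is in use; all three are hypothetical stand-ins introduced for illustration.

```python
def generate_estimated_image(type_result, direction_result,
                             model_file, direction_table, simulate):
    """Turn the network output into an image (hypothetical interfaces)."""
    model = model_file[type_result]               # S131: model for the estimated type
    view_dir = direction_table[direction_result]  # estimated direction as a vector
    return simulate(model, view_dir)              # S132: same simulation as S114
```

Because the simulation is shared with the learning process, an estimated image is rendered in the same way as the simulated images the network was trained on.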

In step S133, the display unit 232 displays the real image 291 and the estimated image 132 on the display 101.
Specifically, the display unit 232 displays a confirmation screen on the display 101. The confirmation screen is a screen on which the real image 291 and the estimated image 132 are displayed side by side.
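The side-by-side layout of the confirmation screen can be sketched as joining two pixel grids row by row. Images are modeled here as lists of rows of numbers, and the function name `side_by_side`, the gap column, and the fill value are assumptions for illustration; the description does not specify how the confirmation screen is composed.

```python
def side_by_side(real_image, estimated_image, gap=1, fill=0):
    """Place two row-major images next to each other, padding the shorter one."""
    height = max(len(real_image), len(estimated_image))

    def row_of(image, r):
        # Rows missing from the shorter image are filled with the fill value.
        return list(image[r]) if r < len(image) else [fill] * len(image[0])

    return [row_of(real_image, r) + [fill] * gap + row_of(estimated_image, r)
            for r in range(height)]
```

The real image occupies the left part of each row and the estimated image the right part, so the user can compare the two at a glance.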

*** Effects of Embodiment 1 ***
The image interpretation support device 200 can convert the output of the neural network 120 into an image to generate the estimated image 132, and can display the real image 291 and the estimated image 132 side by side on the display 101.
The user can then check the validity of the output of the neural network 120 by comparing the real image 291 with the estimated image 132. In other words, the user can check the validity of the output of the neural network 120 without having to call to mind an image of the object while changing the direction from which the object is observed.

*** Supplement to Embodiment 1 ***
The embodiment is an example of a preferred mode and is not intended to limit the technical scope of the present invention. The embodiment may be implemented in part or in combination with other modes. The procedures described using flowcharts and the like may be changed as appropriate.

The image interpretation support device 200 may be realized by a plurality of devices. For example, the three-dimensional model file 292 may be stored in an external server device. In this case, the image interpretation support device 200 accesses the three-dimensional model file 292 by communicating with the external server device.
Each element of the image interpretation support device 200 may be realized by software, hardware, firmware, or any combination of these.
Each "unit" that is an element of the image interpretation support device 200 may be read as a "process" or a "step".

100 image interpretation support system, 101 display, 111 type information, 112 direction information, 113 three-dimensional model, 114 simulated image, 120 neural network, 121 internal parameters, 123 type estimation result, 124 direction estimation result, 131 three-dimensional model, 132 estimated image, 200 image interpretation support device, 201 processor, 202 memory, 203 auxiliary storage device, 204 communication device, 205 input/output interface, 210 learning unit, 211 simulated image generation unit, 212 parameter adjustment unit, 220 estimation unit, 230 support unit, 231 estimated image generation unit, 232 display unit, 290 storage unit, 291 real image, 292 three-dimensional model file.

Claims

An image interpretation support system comprising:
a learning unit that adjusts internal parameters of a neural network by supervised learning;
an estimation unit that estimates, by using the neural network after the internal parameters have been adjusted, the type of an object appearing in a real image and the imaging direction from which the object is observed;
an estimated image generation unit that acquires, from a three-dimensional model file in which types of articles and three-dimensional models representing the three-dimensional shapes of the articles are associated with each other, the three-dimensional model associated with the same type as the type of the object appearing in the real image as estimated by the estimation unit, and that performs an imaging simulation based on the acquired three-dimensional model and the imaging direction estimated by the estimation unit, thereby generating an estimated image obtained by imaging an article of the same type as the estimated type of the object from the same direction as the estimated imaging direction; and
a display unit that displays the real image and the estimated image on a display for a user who, by comparing the real image with the estimated image, confirms the validity of the type of the object and of the imaging direction from which the object is observed, both as estimated by the estimation unit.

An image interpretation support system comprising:
a learning unit that adjusts internal parameters of a neural network by supervised learning;
an estimation unit that estimates, by using the neural network after the internal parameters have been adjusted, the type of an object appearing in a real image and the imaging direction from which the object is observed;
an estimated image generation unit that acquires, from a three-dimensional model file in which types of articles and three-dimensional models representing the three-dimensional shapes of the articles are associated with each other, the three-dimensional model associated with the same type as the type of the object appearing in the real image as estimated by the estimation unit, and that performs an imaging simulation based on the acquired three-dimensional model and the imaging direction estimated by the estimation unit, thereby generating an estimated image obtained by imaging an article of the same type as the estimated type of the object from the same direction as the estimated imaging direction; and
a display unit that, in order to support confirmation of the validity of the interpretation result of the real image, displays the real image and the estimated image on a display for a user who, by comparing the real image with the estimated image, confirms the validity of the type of the object and of the imaging direction from which the object is observed, both as estimated by the estimation unit,
wherein the learning unit
determines a type of article and an imaging direction from which the article is observed,
acquires, from the three-dimensional model file, the three-dimensional model of the article associated with the same type as the determined type, and performs the imaging simulation based on the acquired three-dimensional model and the determined imaging direction, thereby generating a simulated image obtained by imaging an article of the same type as the determined type from the same direction as the determined imaging direction, and
performs the supervised learning by using the simulated image as input data and using the determined type and the determined imaging direction as teacher data.

An image interpretation support program that causes a computer to function as:
a learning unit that adjusts internal parameters of a neural network by supervised learning;
an estimation unit that estimates, by using the neural network after the internal parameters have been adjusted, the type of an object, which is an article appearing in a real image, and the imaging direction from which the object is observed;
an estimated image generation unit that acquires, from a three-dimensional model file in which types of articles and three-dimensional models representing the shapes of the articles are associated with each other, the three-dimensional model associated with the same type as the type of the object appearing in the real image as estimated by the estimation unit, and that performs an imaging simulation based on the acquired three-dimensional model and the imaging direction estimated by the estimation unit, thereby generating an estimated image obtained by imaging an article of the same type as the estimated type of the object from the same direction as the estimated imaging direction; and
a display unit that displays the real image and the estimated image on a display for a user who, by comparing the real image with the estimated image, confirms the validity of the type of the object and of the imaging direction from which the object is observed, both as estimated by the estimation unit.

An image interpretation support program that causes a computer to function as:
a learning unit that adjusts internal parameters of a neural network by supervised learning;
an estimation unit that estimates, by using the neural network after the internal parameters have been adjusted, the type of an object, which is an article appearing in a real image, and the imaging direction from which the object is observed;
an estimated image generation unit that acquires, from a three-dimensional model file in which types of articles and three-dimensional models representing the three-dimensional shapes of the articles are associated with each other, the three-dimensional model associated with the same type as the type of the object appearing in the real image as estimated by the estimation unit, and that performs an imaging simulation based on the acquired three-dimensional model and the imaging direction estimated by the estimation unit, thereby generating an estimated image obtained by imaging an article of the same type as the estimated type of the object from the same direction as the estimated imaging direction; and
a display unit that, in order to support confirmation of the validity of the interpretation result of the real image, displays the real image and the estimated image on a display for a user who, by comparing the real image with the estimated image, confirms the validity of the type of the object and of the imaging direction from which the object is observed, both as estimated by the estimation unit,
wherein the learning unit
determines a type of article and an imaging direction from which the article is observed,
acquires, from the three-dimensional model file, the three-dimensional model associated with the same type as the determined type, and performs the imaging simulation based on the acquired three-dimensional model and the determined imaging direction, thereby generating a simulated image obtained by imaging an article of the same type as the determined type from the same direction as the determined imaging direction, and
performs the supervised learning by using the simulated image as input data and using the determined type and the determined imaging direction as teacher data.