JP2019200536A - Image acquisition apparatus, image acquisition method, and image acquisition program

Info

Publication number: JP2019200536A
Application number: JP2018093949A
Authority: JP
Inventors: Osamu Nonaka; Hidekazu Iwaki; Koji Sakai; Keiji Okada; Yoshihisa Ogata
Original assignee: Olympus Corp
Current assignee: Olympus Corp
Priority date: 2018-05-15
Filing date: 2018-05-15
Publication date: 2019-11-21

Abstract

PROBLEM TO BE SOLVED: To improve the effectiveness of machine learning by recording information about the use of machine learning. SOLUTION: An image acquisition apparatus comprises: an image acquisition unit for acquiring images; an inference unit for performing inference with a predetermined inference model, using the image acquired by the image acquisition unit as input; a presentation unit for presenting the inference result of the inference unit; a determination unit for determining whether or not the inference result is adopted; and a control unit for generating usage information regarding the use of the inference model on the basis of the determination result of the determination unit, and recording the generated usage information as metadata of the image acquired by the image acquisition unit. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image acquisition apparatus, an image acquisition method, and an image acquisition program capable of acquiring images using machine learning.

In recent years, machine learning such as deep learning has come into use. Machine learning learns the features, time-series information, spatial information, and the like of known input information, and performs inference based on the learning result to obtain inference results for unknown matters. That is, in machine learning, a learned model is first obtained that makes it possible to infer a determinable output result from specific input information. Machine learning is one of the elemental technologies of artificial intelligence, and one of its algorithms is the neural network. A neural network implements a recognition process that mimics how the human brain recognizes an object from various features of an image or the like using a network of nerve cells called neurons; it attempts to reproduce on a computer the neural circuitry of the brain and the processes that use it. Deep learning is a neural network built with still more layers. It takes various features of images and the like as input, repeatedly tests by trial and error whether they help recognition, and searches for and learns the most effective recognition model.

So that inference results are obtained with high reliability, a large amount of information whose input-output relationship is known is used as learning data when a learned model is generated. For example, when machine learning is realized by a neural network, the network is designed so that the expected output is obtained for a known input. A learned model obtained through such a process (hereinafter also referred to as an inference model) can be used independently of the neural network that performed the learning.

Such machine learning can be used in various fields. One example is the Watson (trademark) system, which performs natural language processing to read information and perform inference. Patent Document 1 discloses a technique that provides a mechanism, a method, a computer program, and an apparatus for identifying commonality among answer candidates generated by a question answering system such as the Watson system.

Such machine learning could also be used, for example, when an image is acquired by a photographing device or the like. Using machine learning may make it easier to acquire the image the user wants.

Patent Document 1: Japanese Patent Laying-Open No. 2015-109068

However, even when an inference model is used, the expected output may not be obtained, depending on the difference between the data used for learning and the input information on which inference is performed. Conventional image acquisition systems have not considered any technique for using machine learning effectively with this constraint taken into account.

An object of the present invention is to provide an image acquisition apparatus, an image acquisition method, and an image acquisition program that can make effective use of an inference model obtained by machine learning by recording information about the use of machine learning.

An image acquisition apparatus according to one aspect of the present invention includes: an image acquisition unit that acquires an image; an inference unit that performs inference with a predetermined inference model, using the image acquired by the image acquisition unit as input; a presentation unit that presents the inference result of the inference unit; a determination unit that determines whether or not the inference result has been adopted; and a control unit that creates usage information on the use of the inference model based on the determination result of the determination unit and records the created usage information as metadata of the image acquired by the image acquisition unit.

An image acquisition method according to one aspect of the present invention includes: an image acquisition step of acquiring an image; an inference step of performing inference with a predetermined inference model, using the image acquired in the image acquisition step as input; a presentation step of presenting the inference result of the inference step; a determination step of determining whether or not the inference result has been adopted; and a control step of creating usage information on the use of the inference model based on the determination result of the determination step and recording the created usage information as metadata of the image acquired in the image acquisition step.

An image acquisition program according to one aspect of the present invention causes a computer to execute: an image acquisition step of acquiring an image; an inference step of performing inference with a predetermined inference model, using the image acquired in the image acquisition step as input; a presentation step of presenting the inference result of the inference step; a determination step of determining whether or not the inference result has been adopted; and a control step of creating usage information on the use of the inference model based on the determination result of the determination step and recording the created usage information as metadata of the image acquired in the image acquisition step.

According to the present invention, recording information about the use of machine learning makes it possible to use an inference model obtained by machine learning effectively.

Block diagram showing an imaging apparatus including an image acquisition apparatus according to the first embodiment of the present invention.
Explanatory diagram for describing the dictionaries 12a1 and 12a2 stored in the storage unit 12a of the inference engine 12.
Flowchart showing the operation of the image acquisition apparatus 10.
Flowchart showing the operation of the external device 30.
Explanatory diagram for explaining the operation of the first embodiment.
Explanatory diagram for explaining the operation of the first embodiment.
Flowchart showing an operation flow employed in the second embodiment of the present invention.
Flowchart showing an operation flow employed in the second embodiment of the present invention.
Explanatory diagram showing how a subject is imaged by the imaging apparatus 20 of FIG. 1.
Explanatory diagram showing a captured image displayed on the display screen 15a of the display unit 15.
Explanatory diagram showing a dictionary menu screen.
Explanatory diagram showing a dictionary menu screen.

Embodiments of the present invention will be described in detail below with reference to the drawings.

(First embodiment)
FIG. 1 is a block diagram showing an imaging apparatus including an image acquisition apparatus according to the first embodiment of the present invention.
Machine learning such as deep learning mimics the way the human brain recognizes an object from various features of an image or the like using a network of nerve cells called neurons. Because it is built in multiple layers, the resulting “inference model” has the problem that the path from input to output is effectively a black box, so identifying which “inference model” was used becomes important.
The present embodiment improves the effectiveness of inference by keeping track, each time predetermined information is input, of whether the inference result obtained with a predetermined inference model was adopted and, if so, which inference model was adopted. In the present embodiment, information on the use of such an inference model (hereinafter referred to as inference model usage information) is recorded; for example, it is recorded as metadata of an image synchronized with the timing at which the inference model is used. The inference model usage information may be recorded as metadata of various kinds of information, not only images, and it may also be recorded on its own together with information that identifies the usage scene, such as time information.

In some cases the user can set whether an inference model is used. In that case, recording the inference model usage information as evidence makes it material for judging whether the user's decision about using the inference model was correct, and also material for judging the effectiveness of the inference model. The inference model usage information can also be used to clarify the scope of application of the inference model.

The imaging apparatus 20 in FIG. 1 includes the image acquisition apparatus 10. The imaging apparatus 20 images a subject, and the image acquisition apparatus 10 in the imaging apparatus 20 records the image obtained by imaging. The imaging apparatus 20 is not limited to a digital camera or video camera; a camera built into a smartphone or tablet terminal may be used. Furthermore, a microscope, an endoscope, a CT scanner, or the like may be used as the imaging apparatus 20, and various imaging apparatuses that obtain images using white light, ultraviolet rays, infrared rays, X-rays, ultrasonic waves, or the like can be used. Although FIG. 1 shows an example in which the image acquisition apparatus 10 is built into the imaging apparatus 20, the imaging apparatus 20 and the image acquisition apparatus 10 may be configured as separate bodies.

As described later, the image acquisition apparatus 10 can use an inference model when acquiring an image. The image acquisition apparatus 10 may acquire images using an inference model installed in advance, or may acquire an inference model from the external device 30. In other words, the external device 30 is used as needed.

The imaging apparatus 20 includes a control unit 11 and an imaging unit 22. The control unit 11 may be configured by a processor using a CPU or the like and operate according to a program stored in a memory (not shown) to control each unit, or some or all of its functions may be realized by hardware electronic circuits.

The imaging unit 22 includes an image sensor 22a and an optical system 22b. The optical system 22b includes lenses for zooming and focusing, an aperture, and the like (not shown). The optical system 22b also includes a zoom (magnification changing) mechanism, a focus mechanism, and an aperture mechanism (not shown) that drive these lenses.

The image sensor 22a is constituted by a CCD, a CMOS sensor, or the like, and the optical system 22b guides an optical image of the subject onto the imaging surface of the image sensor 22a. The image sensor 22a photoelectrically converts the optical image of the subject to obtain a captured image (imaging signal) of the subject.

The imaging control unit 11a of the control unit 11 drives and controls the zoom mechanism, focus mechanism, and aperture mechanism of the optical system 22b to adjust zoom, aperture, and focus. The imaging unit 22 performs imaging under the control of the imaging control unit 11a and outputs imaging signals of captured images (moving images and still images) to the control unit 11, which serves as the image acquisition unit.

The imaging apparatus 20 is provided with an operation unit 13. The operation unit 13 includes a release button, function buttons, various switches for shooting mode setting, parameter operation, and the like, dials, ring members, and so on (none shown), and outputs operation signals based on user operations to the control unit 11. The control unit 11 controls each unit based on the operation signals from the operation unit 13.

The control unit 11 takes in captured images (moving images and still images) from the imaging unit 22. The image processing unit 11b of the control unit 11 performs predetermined signal processing on the captured images, such as color adjustment processing, matrix conversion processing, noise removal processing, and various other kinds of signal processing.

The imaging apparatus 20 is provided with a display unit 15, and the control unit 11 is provided with a display control unit 11f. The display unit 15 has a display screen such as an LCD (liquid crystal display), and this display screen is provided on, for example, the rear surface of the housing of the imaging apparatus 20. The display control unit 11f causes the display unit 15 to display the captured image that has undergone signal processing by the image processing unit 11b. The display control unit 11f can also cause the display unit 15 to display various menus, warnings, and the like of the imaging apparatus 20.

The imaging apparatus 20 is provided with a communication unit 14, and the control unit 11 is provided with a communication control unit 11e. Under the control of the communication control unit 11e, the communication unit 14 can send information to and receive information from the external device 30. The communication unit 14 can communicate by, for example, short-range wireless such as Bluetooth (registered trademark) and by wireless LAN such as Wi-Fi (registered trademark). The communication unit 14 is not limited to Bluetooth and Wi-Fi and may employ various other communication methods. The communication control unit 11e can receive inference model information from the external device 30 via the communication unit 14.

The control unit 11 is provided with a recording control unit 11c. The recording control unit 11c can compress the captured image after signal processing and supply the compressed image to the recording unit 16 for recording. The recording unit 16 is constituted by a predetermined recording medium; it records information supplied from the control unit 11 and can output recorded information to the control unit 11. A card interface, for example, can be used as the recording unit 16, in which case the recording unit 16 can record image data on a recording medium such as a memory card.

In the present embodiment, the recording unit 16 has an image data recording area 16a and a metadata recording area 16b, and the recording control unit 11c records image data in the image data recording area 16a. The recording control unit 11c also records the inference model usage information as metadata in the metadata recording area 16b. The recording control unit 11c can also read out and reproduce the information recorded in the recording unit 16.

In the present embodiment, the imaging apparatus 20 is provided with an inference engine 12 serving as the inference unit. The inference engine 12 has a storage unit 12a, and the storage unit 12a holds one or more dictionaries (two dictionaries, 12a1 and 12a2, in FIG. 1). Each of the dictionaries 12a1 and 12a2 is constituted by a network obtained when learning in machine learning is completed, that is, by an inference model. The dictionaries 12a1 and 12a2 can be identified by their assigned dictionary IDs, so that, for example, even when dictionaries are imported from the external device 30, only the necessary dictionaries can be imported by dictionary ID.
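The dictionary handling described above can be sketched as follows. This is only an illustration under stated assumptions: the class names `Dictionary` and `InferenceEngineStorage`, the method `import_from`, and the use of the reference numerals as ID strings are all hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Dictionary:
    """One dictionary: an inference model plus its assigned dictionary ID (hypothetical type)."""
    dictionary_id: str
    model: Callable[[object], object]  # stand-in for a learned network


@dataclass
class InferenceEngineStorage:
    """Stand-in for the storage unit 12a, keyed by dictionary ID."""
    dictionaries: Dict[str, Dictionary] = field(default_factory=dict)

    def import_from(self, offered: Iterable[Dictionary],
                    wanted_ids: Iterable[str]) -> List[str]:
        """Store only the offered dictionaries whose ID was requested."""
        wanted = set(wanted_ids)
        imported = []
        for candidate in offered:
            if candidate.dictionary_id in wanted:
                self.dictionaries[candidate.dictionary_id] = candidate
                imported.append(candidate.dictionary_id)
        return imported


# Usage: an external source offers three dictionaries, only one is needed.
offered = [Dictionary(i, lambda image: None) for i in ("12a1", "12a2", "12a3")]
storage = InferenceEngineStorage()
imported_ids = storage.import_from(offered, wanted_ids=["12a2"])
```

Keying the storage by ID is one simple way to make "which dictionary was used" recordable later; the specification does not prescribe a data structure.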

FIG. 2 is an explanatory diagram for describing the dictionaries 12a1 and 12a2 stored in the storage unit 12a of the inference engine 12. In FIG. 2, a large number of data sets, each pairing an input A with an output B, are given to a predetermined network N1 as learning data. The network design of the network N1 is thereby determined so that the output B corresponding to the input A is obtained. Various known networks may be employed as the network N1 used for machine learning. For example, R-CNN (Regions with CNN features), which uses a CNN (Convolutional Neural Network), or FCN (Fully Convolutional Networks) may be used. The inference model may also be obtained with any of various known machine learning techniques, not only deep learning.

Giving a large number of data sets to the network N1 determines the design of the network N1 so that input-output relationships similar to the relationship between the input A and the output B are obtained with high reliability. The network N1 that has completed learning in this way becomes usable as an inference model IM1.

The storage unit 12a of the inference engine 12 stores the dictionary 12a1 corresponding to the inference model IM1. The storage unit 12a also stores the dictionary 12a2, which corresponds to an inference model obtained using a network to which a large number of data sets with an input-output relationship different from the above relationship between the input A and the output B were given as learning data.

The control unit 11 is provided with a setting control unit 11d, which can control the inference engine 12 to have it perform inference. The control unit 11 may control each unit according to the inference result of the inference engine 12. For example, when the inference engine 12 detects an object for focus control, the inference engine 12, on receiving a captured image, determines whether the object is present in that captured image and, if it is, outputs its position in the image to the control unit 11. In this case, the imaging control unit 11a performs focus control so that the focus is set at the position of the detected object.

In the present embodiment, the setting control unit 11d can control the display control unit 11f, which serves as the presentation unit, to display the result of inference by the inference engine 12 on the display screen of the display unit 15. For example, when an object to be subjected to focus control is detected by inference by the inference engine 12, the display control unit 11f may display an indication that lets the user recognize the detection result, such as a frame surrounding the detected object.

The setting control unit 11d is not limited to display and may present the result of inference by the inference engine 12 to the user by various methods. For example, the setting control unit 11d may present the inference result by sound, or by mechanical control of a drive unit.

Furthermore, in the present embodiment, the setting control unit 11d, serving as the determination unit, makes a determination based on a user operation on the operation unit 13 or a determination based on image analysis of the captured image signal-processed by the image processing unit 11b, and can thereby determine and set whether inference using the inference engine 12 is adopted and, if so, which dictionary's inference is adopted.
Because the user operates the operation unit 13 to express a specific intention, the operation is strong evidence for judging whether the inference result was useful to that user. In particular, with a device for personal use, an operation by the user of that device can be regarded as reflecting that user's judgment. Even with a device used by various people, a similar effect can be expected if a function for identifying the individual is provided, for example fingerprint authentication on the operation unit 13 or voiceprint authentication using the voice at the time of use. Devices operated by voice are also increasing; in that case, the function that acquires the voice and determines its content serves as the operation unit, and voiceprint authentication is then easy to use in combination.
The setting control unit 11d may also use information registered in advance to determine which operation on the operation unit 13 rejects which inference result. For example, the recording unit 16 may be provided with an inference function / operation relation database 16c. The inference function / operation relation database 16c is a database of which control an inference result corresponds to and which operation member that control is associated with; by referring to it, the setting control unit 11d can determine the operation that adopts or rejects an inference result (hereinafter also referred to as a related operation). For example, when the inference model is for realizing a focus function and the user operates the focus ring in response to the displayed result of that inference function, the setting control unit 11d can determine that the user has rejected the inference result.
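The related-operation lookup can be illustrated with a small sketch. The table contents, the string labels such as `"focus_ring"`, and the function name `inference_adopted` are assumptions made for illustration; the specification only states that the database links an inference function to the operation members associated with it.

```python
from typing import Dict, Optional, Set

# Hypothetical contents of the inference function / operation relation
# database 16c: inference function -> operations that override its result.
RELATION_DB: Dict[str, Set[str]] = {
    "focus": {"focus_ring"},
}


def inference_adopted(function: str,
                      user_operation: Optional[str],
                      relation_db: Dict[str, Set[str]] = RELATION_DB) -> bool:
    """Return False when the user performed a related operation, i.e. an
    operation registered as overriding the given inference function."""
    related_operations = relation_db.get(function, set())
    return user_operation not in related_operations


# Usage: turning the focus ring after a focus inference counts as rejection,
# while pressing the release button or doing nothing does not.
rejected = not inference_adopted("focus", "focus_ring")
kept = inference_adopted("focus", "release_button")
```

A fuller implementation would presumably also consider timing (how soon after the presentation the operation occurred), which this sketch leaves out.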

The setting control unit 11d also supplies information on this setting (the inference model usage information) to the recording control unit 11c. The recording control unit 11c then stores the inference model usage information, in synchronization with the captured image obtained by the imaging unit 22, in the metadata recording area 16b of the recording unit 16 as metadata of the captured image. In this way, evidence on the use of the inference model is recorded.
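As a rough illustration of what such a metadata record might contain, the sketch below serializes the usage information to JSON. The field names, the JSON encoding, and the timestamp key are assumptions; the specification does not define a metadata format.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class InferenceModelUsageInfo:
    """Hypothetical layout of the inference model usage information."""
    inference_used: bool          # was inference by the engine used at all
    dictionary_id: Optional[str]  # which dictionary (inference model) was used
    result_adopted: bool          # did the user adopt the presented result


def build_metadata(usage: InferenceModelUsageInfo, captured_at: str) -> str:
    """Build the metadata string to be stored alongside one captured image."""
    record = {
        "captured_at": captured_at,  # ties the record to the image timing
        "inference_model_usage": asdict(usage),
    }
    return json.dumps(record, sort_keys=True)


# Usage: dictionary 12a1 was used, but the user overrode its result.
metadata = build_metadata(
    InferenceModelUsageInfo(inference_used=True, dictionary_id="12a1",
                            result_adopted=False),
    captured_at="2018-05-15T10:00:00",
)
```

Recording the rejected cases as well as the adopted ones is what makes the record usable as evidence about where a model does and does not apply.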

The imaging apparatus 20 can also acquire an inference model from the external device 30 via the communication unit 14. The external device 30 includes a learning unit 31 and an external image database (DB) 32. The learning unit 31 has a communication unit 31b, and the external image DB 32 has a communication unit 33. The communication units 31b and 33 can communicate with each other, and the communication unit 31b can also communicate with the communication unit 14.

The learning unit 31 includes a population creation unit 31a, an output setting unit 31c, and an input/output modeling unit 31d, and the external image DB 32 includes an image classification function unit 34. The image classification function unit 34 classifies a plurality of images by the type of object contained in each image and records them. FIG. 1 shows an example in which the image classification function unit 34 records an image group of a first object type and an image group of a second object type, but the number of types into which images are classified can be set as appropriate.

The population creation unit 31a reads images from the external image DB 32 and creates a population that serves as the basis of the learning data. The output setting unit 31c sets the output for the images of the population. For example, the apparatus of FIG. 1 could be used to detect the object on which a captured image is to be focused. When focusing on a human eye in the captured image, for instance, the image portion of the eye is detected by inference. In this case, the population creation unit 31a uses images of eyes as the population, and the output setting unit 31c sets, together with information indicating that the image is an eye, the parameters to be used when capturing the image and the focus position.

The input/output modeling unit 31d generates, for example by the method shown in FIG. 2, a learning model (inference model) that has learned the relationship between the image population created by the population creation unit 31a and the output set by the output setting unit 31c. When requested by the control unit 11 of the image acquisition apparatus 10, the learning unit 31 transmits the generated inference model to the image acquisition apparatus 10 via the communication units 31b and 14. The control unit 11 can store the inference model acquired via the communication unit 14 in the storage unit 12a of the inference engine 12 as a dictionary.

Next, the operation of the embodiment configured as described above will be described with reference to FIGS. 3 to 6. FIGS. 3 and 4 are flowcharts for explaining the operation of the first embodiment: FIG. 3 shows the operation of the image acquisition apparatus 10, and FIG. 4 shows the operation of the external device 30. FIGS. 5 and 6 are explanatory diagrams for explaining the operation of the first embodiment.

FIGS. 3 to 6 explain the operation when an industrial endoscope is configured using the imaging device 20 of FIG. 1. For example, the imaging device 20 is configured by housing the imaging unit 22 of FIG. 1 in the distal end portion 23a of the insertion portion of the industrial endoscope. The imaging unit 22 is assumed to be able to image the area ahead of the distal end portion 23a. FIG. 5 shows the movement of the distal end portion 23a as changes in its position at predetermined time intervals; the part of the insertion portion on the proximal side of the distal end portion 23a is not shown. In the example of FIG. 5, as indicated by the arrows, the distal end portion 23a enters from the inlet side of the pipe 41 and advances toward the deep portion 43 of the lumen of the pipe 41. Images P1, P2, ... in FIG. 5 are the images captured in sequence by the imaging unit 22 as the distal end portion 23a moves.

At the timings at which the images P1 to P3 are obtained, the distal end portion 23a faces approximately toward the deep portion 43, and an image 43a of the deep portion 43 of the lumen appears at the approximate center of the images P1 to P3. A convex portion 42 is formed on the inner wall of the pipe 41; as the distal end portion 23a approaches it, a visible image 42a of the convex portion 42 is captured in the image P4. As the distal end portion 23a comes still closer to the convex portion 42, the image 42a is captured at a larger size, as shown in the image P5. When the distal end portion 23a advances further toward the deep portion 43, the image 42a no longer appears in the image P6.

In the images P1 to P6, the image 43a of the deep portion 43 always lies at the approximate center of the image, which shows that the distal end portion 23a is advancing toward the deep portion 43. It is assumed that the storage unit 12a of the inference engine 12 stores, as a dictionary, an inference model obtained by learning images similar to the images P1 to P6 and their changes. That is, the inference engine 12 can infer how the captured image changes when the insertion portion is inserted correctly.

FIG. 3 shows an example in which inference is used to determine whether the endoscope insertion portion is being inserted correctly. While the insertion portion is being inserted, the imaging unit 22 housed in the distal end portion 23a captures images. In step S1 of FIG. 3, the control unit 11 of the imaging device 20 takes in the image captured by the imaging unit 22. The display control unit 11f of the control unit 11 gives the captured image to the display unit 15 for display. The control unit 11 also gives the captured image to the inference engine 12 and causes it to infer whether the insertion portion is being inserted correctly.

That is, in step S2, the recording control unit 11c of the control unit 11 gives the captured images to the recording unit 16 for temporary recording while giving two consecutively captured images to the inference engine 12. The inference engine 12 compares the consecutively captured images and detects whether there is a change (step S3). When the inference engine 12 detects a change between the two images, it moves the process from step S3 to step S4, infers whether the change between the images is the change that occurs when insertion is performed correctly, and outputs the inference result to the control unit 11. When no image change is detected in step S3, the inference engine 12 moves the process to step S11, where it is determined whether a shooting operation has been performed. When insertion is performed correctly as in FIG. 5, the inference engine 12 infers that the change is a smooth one. In this case, the control unit 11 moves the process from step S5 to step S11 and determines whether a shooting operation has been performed.
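Steps S3 to S5 can be illustrated with a minimal Python sketch. The source does not specify how the change between frames is detected, so the mean absolute pixel difference, the `threshold` value, the function names, and the `model` callable (which stands in for the inference model stored as a dictionary) are all assumptions made for illustration.

```python
def frames_differ(prev, curr, threshold=0.05):
    """S3 (assumed method): mean absolute difference of two equal-length
    grayscale frames given as flat lists of values in the range 0..1."""
    total = sum(abs(a - b) for a, b in zip(prev, curr))
    return total / len(prev) > threshold


def check_insertion(prev, curr, model):
    """Return 'no_change', 'ok', or 'warn' for two consecutive frames.

    `model` is a hypothetical callable returning True when the change
    between the frames looks like a correct insertion (S4).
    """
    if not frames_differ(prev, curr):
        return "no_change"  # S3 -> S11: nothing to infer
    # S4/S5: smooth change -> S11, otherwise -> S6 (warning)
    return "ok" if model(prev, curr) else "warn"
```

A caller would show the warning of step S6 only when `check_insertion` returns `"warn"`.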

Now suppose that the insertion portion is inserted as shown in FIG. 6. Using the same notation as FIG. 5, FIG. 6 shows an insertion for which it is inferred that the insertion portion is not being inserted correctly. In the example of FIG. 6, the distal end portion 23a advances toward the deep portion 43 until the images P11 to P14 have been captured. By the time the image P15 is captured, the distal end portion 23a has turned away from the deep portion 43 and is advancing toward the inner wall of the pipe 41. As a result, the image 43a of the deep portion 43 shifts away from the center of the image, and by the time the image P16 is captured, the image 43a has shifted so far from the center that a collision between the distal end portion 23a and the inner wall of the pipe 41 is expected. The image P16 also contains a display 44 indicating the predicted position of the collision between the distal end portion 23a and the inner wall of the pipe 41.

When the insertion shown in FIG. 6 is performed, the inference engine 12 outputs to the control unit 11, in step S4, an inference result indicating that the captured image has changed in a way that is not smooth. In step S5, the control unit 11 determines that the change in the captured image is not a smooth change and moves the process to step S6. In step S6, the display control unit 11f displays, on the display screen of the display unit 15, a warning indicating that insertion is not being performed correctly. The control unit 11 may also generate a warning sound. Next, in step S7, the control unit 11 determines whether the operation is being continued. For example, the control unit 11 can determine by image analysis of the captured images whether the distal end portion 23a is still moving. When, for example, the distal end portion 23a has stopped advancing, the control unit 11 determines that the operation is not being continued and returns the process to step S1.

On the other hand, when the control unit 11 determines that the operation is being continued, it determines in step S8 that the warning has been ignored, automatically captures an image, and records it as evidence. That is, the captured image processed by the image processing unit 11b is recorded by the recording control unit 11c in the image data recording area 16a of the recording unit 16. The setting control unit 11d generates inference model usage information indicating that the inference by the inference engine 12 was ignored and not used, and gives it to the recording control unit 11c. The recording control unit 11c then records the inference model usage information as metadata of the captured image recorded in the image data recording area 16a. The recorded image and inference model usage information make clear that insertion of the distal end portion 23a was continued in disregard of the inference result that it was not being inserted correctly, and show the endoscopic image at that time.

In the next step S9, the control unit 11 determines whether the capture and recording of step S8 have been repeated a predetermined number of times. If the predetermined number has not been reached, the control unit 11 moves the process to step S11. If the capture and recording of step S8 have reached the predetermined number of times, the control unit 11 determines that there is a problem with how the warning is presented and changes the warning method in the next step S10. For example, the warning display may be enlarged, its color or timing may be changed, or a warning sound may be added to the display or its volume changed.
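The handling of ignored warnings in steps S8 to S10 can be sketched as follows. This is only an illustration under stated assumptions: the class name, the metadata keys, the default count of 3, the integer escalation level, and the decision to reset the counter after escalating are hypothetical and do not come from the source.

```python
from dataclasses import dataclass, field


@dataclass
class WarningTracker:
    """Counts ignored warnings (S8) and escalates the warning (S9, S10)."""
    max_ignored: int = 3   # assumed "predetermined number of times"
    ignored: int = 0
    level: int = 0         # 0 = default warning; higher = larger or louder
    records: list = field(default_factory=list)

    def on_warning_ignored(self, image_id: str) -> dict:
        # S8: record usage information as metadata of the evidence image.
        meta = {"image": image_id,
                "inference_used": False,
                "reason": "warning_ignored"}
        self.records.append(meta)
        self.ignored += 1
        # S9/S10: once the count is reached, change the warning method.
        if self.ignored >= self.max_ignored:
            self.level += 1
            self.ignored = 0  # assumption: counting restarts after a change
        return meta
```

Keeping the metadata records alongside the counter mirrors the idea that the same evidence serves both as a log of non-adoption and as the trigger for changing the warning method.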

制御部１１は、次のステップＳ１１において、撮影操作が行われたか否かを判定する。制御部１１は、撮影操作が行われていない場合には処理をステップＳ１に戻し、撮影操作が行われた場合にはステップＳ１２において、撮影及び記録を行う。 In step S11, the control unit 11 determines whether or not a shooting operation has been performed. The control unit 11 returns the process to step S1 when the photographing operation is not performed, and performs photographing and recording at step S12 when the photographing operation is performed.

図４は上述した推論モデルの作成方法を説明するものである。外部機器３０の外部画像ＤＢ３２には、内視鏡挿入時の画像が記憶されている。学習部３１の母集合作成部３１ａは、ステップＳ２１において、挿入時画像変化を母集合とする。入出力モデル化部３１ｄは、ステップＳ２２において、挿入良好（ＯＫ）時の画像変化を教師データ化し、ステップＳ２３において、挿入不良（ＮＧ）時の画像変化を教師データ化して、推論モデルを生成する（ステップＳ２４）。学習部３１は、依頼データがある場合には、当該依頼データを用いた推論を行う（ステップＳ２５）。入出力モデル化部３１ｄは、ステップＳ２５における推論の信頼性が所定値以上であるか否かを判定する（ステップＳ２６）。 FIG. 4 illustrates a method for creating the inference model described above. The external image DB 32 of the external device 30 stores an image when the endoscope is inserted. In step S21, the population creation unit 31a of the learning unit 31 sets the image change upon insertion as the population. In step S22, the input / output modeling unit 31d converts the image change at the time of good insertion (OK) into teacher data, and at step S23, converts the image change at the time of poor insertion (NG) into teacher data to generate an inference model. (Step S24). If there is request data, the learning unit 31 performs inference using the request data (step S25). The input / output modeling unit 31d determines whether or not the inference reliability in step S25 is equal to or greater than a predetermined value (step S26).

If the reliability is not equal to or higher than the predetermined value, the input/output modeling unit 31d moves the process to step S27, resets the learning population and the like, and then returns the process to step S24 to generate the inference model again. When the reliability becomes equal to or higher than the predetermined value, the input/output modeling unit 31d moves the process to step S28 and transmits the generated inference model to the image acquisition apparatus 10 via the communication unit 31b. In this way, an inference model for determining whether insertion has been performed correctly is stored in the inference engine 12 of the image acquisition apparatus 10.

なお、推論エンジン１２には複数の推論モデル（辞書）を記憶させることができ、推論エンジン１２は、挿入部の挿入対象物毎の推論モデルを備えている。設定制御部１１ｄは、挿入対象物毎に使用する推論モデルを変更可能である。 The inference engine 12 can store a plurality of inference models (dictionaries), and the inference engine 12 includes an inference model for each insertion object of the insertion unit. The setting control unit 11d can change the inference model used for each insertion object.

As described above, in the present embodiment, inference is performed using an inference model, and inference model usage information is recorded that indicates whether the inference result was adopted and, if so, which inference model was used. The result of inference by an inference model is not always valid. Recording the inference model usage information makes it easier to determine the boundary between cases where the inference model is effective and cases where it is not, clarifies the range over which the inference model is applicable, and promotes effective use of the inference model. In addition, when the inference result has been ignored a predetermined number of times or more, it can be determined that the warning method based on the inference result is inappropriate, which contributes to improving the warning method and the like.

In this embodiment, the inference function/operation relation database 16c registers that, for an inference function such as a guide display, operations such as an insertion operation or a stop operation are operations that reject the inference result. Such insertion and stop operations can be determined from changes in the captured image. For this inference function, an operation in which the user presses a "failure" button may also be included in the database.

(Second Embodiment)
FIGS. 7 and 8 are flowcharts showing the operation flow employed in the second embodiment of the present invention. The hardware configuration of this embodiment is the same as that shown in FIG. 1. This embodiment shows an example in which a digital camera is configured using the imaging device 20 of FIG. 1.

図９は図１の撮像装置２０により被写体を撮像する様子を示す説明図である。図１の撮像装置２０の各部は、図９の筐体２０ａ内に収納されている。筐体２０ａの背面には表示部１５を構成する表示画面１５ａが配設されている。また、筐体２０ａの前面には、光学系２２ｂを構成する図示しないレンズが配設されており、筐体２０ａの上面には、操作部１３を構成するシャッタボタン１３ａが配設されている。 FIG. 9 is an explanatory diagram showing a state in which a subject is imaged by the imaging device 20 of FIG. Each part of the imaging device 20 in FIG. 1 is housed in a housing 20a in FIG. A display screen 15a constituting the display unit 15 is disposed on the back surface of the housing 20a. Further, a lens (not shown) constituting the optical system 22b is disposed on the front surface of the housing 20a, and a shutter button 13a constituting the operation unit 13 is disposed on the upper surface of the housing 20a.

FIG. 9 shows an example of photographing, as a subject, a janome ("snake-eye") butterfly (hereinafter, butterfly) 56 perched on a plant 55. The user 51, for example, holds the housing 20a with the right hand 52 and, while watching the display screen 15a of the display unit 15 with the butterfly 56 within the field of view, presses the shutter button 13a with a finger 52a of the right hand to take a picture. The butterfly 56 has a pattern resembling a human eye on its wings 57.

本実施の形態においては、推論モデルを用いて、フォーカス制御に用いる対象物の判定を行う。即ち、推論エンジン１２には、フォーカス制御の対象物を検出するための推論モデルが記憶されている。例えば、推論エンジン１２には、人間の目を検出するための推論モデル（以下、人間辞書という）が記憶されているものとする。 In the present embodiment, an object used for focus control is determined using an inference model. That is, the inference engine 12 stores an inference model for detecting an object for focus control. For example, it is assumed that an inference model (hereinafter referred to as a human dictionary) for detecting human eyes is stored in the inference engine 12.

制御部１１は、図７のステップＳ３１において、画像の取得モードが指定されているか否かを判定する。画像の取得モードが指定されている場合には、制御部１１は、ステップＳ３２において画像入力及び表示を行う。即ち、撮像部２２は被写体を撮像し、制御部１１は、撮像部２２からの撮像画像を取り込み、撮像画像をスルー画として表示部１５に与えて表示させる。 The controller 11 determines whether or not an image acquisition mode is designated in step S31 of FIG. When the image acquisition mode is designated, the control unit 11 performs image input and display in step S32. That is, the imaging unit 22 images a subject, and the control unit 11 captures a captured image from the imaging unit 22 and gives the captured image as a through image to the display unit 15 for display.

本実施の形態においては、設定制御部１１ｄは、推論エンジン１２にフォーカス制御の対象物の検出のための推論を実行させる。推論エンジン１２は、記憶部１２ａに記憶されている推論モデル（人間辞書）を用いて、撮像画像中からフォーカス制御の対象物である人間の目の画像部分を検出する。推論エンジン１２は、推論の結果を制御部１１に出力する。 In the present embodiment, the setting control unit 11d causes the inference engine 12 to perform inference for detecting an object for focus control. The inference engine 12 uses the inference model (human dictionary) stored in the storage unit 12a to detect the image portion of the human eye that is the target of focus control from the captured image. The inference engine 12 outputs the inference result to the control unit 11.

FIG. 10 is an explanatory diagram showing captured images displayed on the display screen 15a of the display unit 15. As described above, the user 51 is trying to photograph the butterfly 56 on the plant 55. The image P21 shows the through image (live view) at a certain moment. The through image displayed on the display screen 15a contains an image 61 of the butterfly 56. At this point, the inference engine 12 gives the control unit 11 an inference result indicating low reliability as an eye image. The control unit 11 determines in step S33 whether inference has been performed and, when an inference result is obtained, determines in step S34 whether the reliability of the inference result is higher than a predetermined threshold. In this case the reliability is low, so the control unit 11 moves the process to step S39 and determines whether a shooting operation has been performed. If no shooting operation has been performed, the control unit 11 returns the process to step S31.

Next, suppose that the image P22 of FIG. 10 is displayed on the display screen 15a as the through image. The image P22 contains an image 62 of the butterfly 56 with its wings spread. The image 62 includes a portion resembling a human eye, and the inference engine 12 infers that this wing pattern is the image of a human eye. As a result, the inference engine 12 outputs to the control unit 11 an inference result whose reliability is higher than the predetermined threshold.

The control unit 11 then determines in step S34 that the reliability of the inference result of the inference engine 12 is high and moves the process to step S35. In step S35, the setting control unit 11d controls the display control unit 11f to display the inference result. As the inference result, the display control unit 11f displays a frame image 64 indicating the position at which the eye image portion was detected (image P23 in FIG. 10). The frame image 64 indicates the focus position (AF point) used in focus control.

オートフォーカスが設定されている場合には、設定制御部１１ｄは、推論によって検出された目の画像部分をピント位置とする情報を撮像制御部１１ａに与える。撮像制御部１１ａは、指定されたピント位置において合焦状態となるように、光学系２２ｂを駆動制御する。これにより、推論によって検出された目の画像部分にピントが合った撮像画像が得られる。 When the autofocus is set, the setting control unit 11d gives the imaging control unit 11a information that sets the image portion of the eye detected by inference as the focus position. The imaging control unit 11a drives and controls the optical system 22b so that the in-focus state is obtained at the designated focus position. Thereby, a captured image in which the image portion of the eye detected by inference is in focus is obtained.
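The reliability check of step S34 and the selection of the AF point can be sketched as below. The detection format (a list of dictionaries with `score` and `box` keys), the function name, and the threshold value of 0.8 are assumptions made for illustration; the source only states that the reliability is compared with a predetermined threshold.

```python
def select_af_point(detections, threshold=0.8):
    """Return the box of the most reliable detection, or None.

    `detections` is a hypothetical list of dicts such as
    {"score": 0.93, "box": (x, y, w, h)} produced by the inference engine.
    """
    best = max(detections, key=lambda d: d["score"], default=None)
    # S34: only a result whose reliability exceeds the threshold is shown
    # as the frame image and passed on as the focus position.
    if best is None or best["score"] <= threshold:
        return None
    return best["box"]
```

Returning `None` corresponds to the image P21 case, in which no frame is displayed and the process moves on to the shooting-operation check.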

The image P23 in FIG. 10 shows the through image displayed on the display screen 15a in this case; it has an in-focus image portion 63a and an out-of-focus image portion 63b (drawn with broken lines). Because a human dictionary (person determination or face detection) for detecting human eyes was used, the image P23 is focused on the wing pattern of the butterfly 56, which is probably not the focus state the user wants.

Suppose, therefore, that the user 51 performs an operation to change the focus using the operation unit 13, for example a dial operation. In step S36, the control unit 11 determines whether the user has performed a focus change operation; when it detects one, it moves the process to step S39. In step S39, the control unit 11 determines whether a shooting operation has been performed. If not, it returns the process to step S31; if a shooting operation has been performed, it moves the process to step S40 and captures the image. Step S40 is executed when shooting is performed with the focus set by the user's focus change operation rather than by the focus control based on the inference result of the inference model (the human dictionary for person determination or face detection). In step S40, the control unit 11 captures the image and records evidence regarding the use of inference. That is, the captured image processed by the image processing unit 11b is recorded by the recording control unit 11c in the image data recording area 16a of the recording unit 16. The setting control unit 11d generates inference model usage information indicating that the inference by the inference engine 12 was not used, and gives it to the recording control unit 11c. The recording control unit 11c then records the inference model usage information as metadata of the captured image recorded in the image data recording area 16a. The recorded image and inference model usage information make clear that the image was captured according to the user's own focus change operation, without adopting the inference that used the inference model stored in the inference engine 12.

What matters, however, is identifying operations on the operation unit that correspond to the same function item as the one set automatically through the inference model, or to a similar one. Even if an operation unrelated to the output of the inference model is performed, it does not express dissatisfaction with the inference result, so it is not reflected in the inference model usage information as a history of non-use. In other words, the determination unit determines whether the inference result has been adopted according to the operation result of the operation unit and the presented content of the inference result. For this determination to be correct, it must depend on whether the operation affects an item related to the presented content of the inference result, and the determination unit therefore takes into account the range over which the inference unit is involved. For example, when a focus operation is performed in response to a focus result (such as whether focus was achieved correctly, whether the displayed focus position matched the user's intention, or the focus position and display result), the operation can be regarded as a rejection of the inference result. To make such a determination, the recording unit may be provided with an inference function/operation relation database that indicates what control each inference leads to and which operation unit that control is associated with. For example, since face detection is used for focusing, the function it controls is focusing, and a focus ring or the like is the operation unit corresponding to that function. For the function, described later, that displays the part to be focused on, focus switching via a touch panel, for example, may also be treated as a corresponding operation.
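The role of the inference function/operation relation database can be illustrated with a small lookup table. The table contents, the function and operation names, and `inference_adopted` are hypothetical; the source states only that the database relates each inference function to the operation units whose use indicates that the inference result was not adopted.

```python
# Hypothetical contents of the inference function/operation relation database:
# for each inference function, the operations that override its result.
RELATION_DB = {
    "eye_detection_af": {"focus_ring", "focus_dial", "touch_af"},
    "insertion_guide": {"continue_insertion", "failure_button"},
}


def inference_adopted(function, operations):
    """Return False only if some operation is related to the given function.

    Unrelated operations (for example, changing the exposure) are not
    treated as dissatisfaction with the inference result.
    """
    related = RELATION_DB.get(function, set())
    return not (related & set(operations))
```

With this table, turning the focus ring after an eye-detection autofocus counts as non-adoption, whereas an exposure adjustment leaves the usage information recording the result as adopted.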

また、ユーザは、人間辞書を用いたフォーカス制御に満足せず、他の辞書（推論モデル）を用いたフォーカス制御を希望するものとする。この場合には、ユーザは操作部１３を操作して取得モードを終了し、辞書に関するメニューを表示させる。この操作が行われると、制御部１１は、表示制御部１１ｆを制御して辞書メニューを表示部１５の表示画面１５ａに表示させる。 Further, it is assumed that the user is not satisfied with the focus control using the human dictionary and desires the focus control using another dictionary (inference model). In this case, the user operates the operation unit 13 to end the acquisition mode and display a menu related to the dictionary. When this operation is performed, the control unit 11 controls the display control unit 11f to display a dictionary menu on the display screen 15a of the display unit 15.

FIGS. 11A and 11B are explanatory diagrams showing a dictionary menu screen 70. The dictionary menu screen 70 displays the inference models (dictionaries) registered in the inference engine 12. In the example of FIG. 11A, an icon 71 labeled "human dictionary" shows that only the human dictionary is registered. To add a dictionary suited to photographing the butterfly 56, the user 51 operates the add button 72. For example, when a touch panel (not shown) is provided on the display screen 15a, the user 51 touches the add button 72.

制御部１１は、ステップＳ４１において辞書の転送が依頼されているか否かを判定しており、追加ボタン７２が操作されると、ユーザが新たな辞書の転送を希望しているものと判定して、処理をステップＳ４２に移行する。ステップＳ４２において、表示制御部１１ｆは、対象物の設定画面及び再学習物の設定画面を表示させて、ユーザによる対象物の指定及び再学習物の指定を可能にする。制御部１１は、ユーザによって指定された対象物又は再学習物に対する学習依頼又は再学習依頼を、外部機器３０に対して行う。 The control unit 11 determines whether or not a dictionary transfer is requested in step S41. When the add button 72 is operated, the control unit 11 determines that the user desires to transfer a new dictionary. Then, the process proceeds to step S42. In step S42, the display control unit 11f displays a target setting screen and a re-learning object setting screen to allow the user to specify the target and the re-learning target. The control unit 11 makes a learning request or a re-learning request for the target object or the re-learning object specified by the user to the external device 30.

FIG. 8 shows the inference model creation process in the external device 30. In FIG. 8, steps identical to those in FIG. 4 are given the same reference numerals, and their description is omitted. In step S51, the learning unit 31 of the external device 30 determines whether it has received a learning request or a re-learning request from the control unit 11 of the image acquisition apparatus 10 via the communication units 14 and 31b. On receiving a learning request or a re-learning request, the learning unit 31 sets, in step S52, the object included in the request. For example, suppose that the control unit 11 has requested learning of a butterfly dictionary based on an operation by the user 51. In this case, in step S53, the population creation unit 31a converts butterfly images, which are the object images, and their focus positions into teacher data. In step S54, the population creation unit 31a converts images that are not object images into teacher data separately from the object images.

The input/output modeling unit 31d generates an inference model by learning with the teacher data generated in steps S53 and S54 (step S24). In step S25 the learning unit 31 performs inference using the request data, and in step S26 it determines whether the reliability of the inference is equal to or greater than a predetermined value.

If the reliability is below the predetermined value, the input/output modeling unit 31d moves the process from step S26 to step S55, resets the teacher data and the like, and then determines in step S56 whether the resetting has been performed a predetermined number of times or more. If it has not, the input/output modeling unit 31d returns the process to step S24. If the resetting has been performed the predetermined number of times or more, the input/output modeling unit 31d moves the process from step S56 to step S57, determines that the image of the target object is a weak image unsuited to inference, transmits weak-image information to the image acquisition apparatus 10, and then moves the process to step S28. If the input/output modeling unit 31d determines in step S26 that the reliability has reached the predetermined value, it moves the process to step S28.
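The loop through steps S24 to S26 and S55 to S57 can be sketched roughly as follows. This is only an illustrative sketch, not the implementation in the external device 30: the function name `build_inference_model`, the `TrainingOutcome` record, and the callables `train`, `evaluate`, and `reset_teacher_data` are all hypothetical stand-ins, and the threshold and retry limit are left as parameters because the text only calls them "predetermined".

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TrainingOutcome:
    """Result handed on to step S28 (hypothetical record, not from the source)."""
    model: Any
    reliability: float
    weak_image: bool  # True when the retry limit was reached (step S57)
    resets: int


def build_inference_model(
    train: Callable[[Any], Any],
    evaluate: Callable[[Any, Any], float],
    reset_teacher_data: Callable[[Any], Any],
    teacher_data: Any,
    request_data: Any,
    threshold: float,
    max_resets: int,
) -> TrainingOutcome:
    resets = 0
    while True:
        model = train(teacher_data)                  # step S24: learn from teacher data
        reliability = evaluate(model, request_data)  # step S25: infer on the request data
        if reliability >= threshold:                 # step S26: reliable enough
            return TrainingOutcome(model, reliability, False, resets)
        teacher_data = reset_teacher_data(teacher_data)  # step S55: reset teacher data
        resets += 1
        if resets >= max_resets:                     # step S56: retry limit reached
            # step S57: flag the target as a weak image and stop retrying
            return TrainingOutcome(model, reliability, True, resets)
```

In this sketch the model returned in the weak-image case is simply the last one trained; the text does not say which model accompanies the weak-image information, so that choice is an assumption.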

In this way, the learning unit 31 transmits to the image acquisition apparatus 10, via the communication unit 31b, either an inference model whose reliability is equal to or greater than the predetermined value or an inference model corresponding to the weak-image information. In step S44, the control unit 11 of the image acquisition apparatus 10 stores the received inference model in the inference engine 12 and records the weak-image information in the recording unit 16.

FIG. 11B shows the dictionary menu screen 70 displayed on the display screen 15a in this case. In the example of FIG. 11B, the dictionary menu screen 70 shows that a human dictionary is registered, by means of an icon 71a labeled "human dictionary", and that a butterfly dictionary is registered, by means of an icon 73 labeled "butterfly dictionary". The icon 71a and the icon 73 are drawn with a broken line and a solid line to show that they are displayed differently: a broken line indicates that the dictionary is not selected, and a solid line indicates that it is selected. The user selects a dictionary by touching the corresponding icon.

Image P24 in FIG. 10 shows the through image displayed on the display screen 15a when focus control is performed with the butterfly dictionary in the acquisition mode. Using the butterfly dictionary, the inference engine 12 infers and detects the image portion of the butterfly 56 in the image P24 and, as the inference result, gives the control unit 11 information on the focus position and on the various shooting parameters that should be set for imaging the butterfly. The imaging control unit 11a controls the imaging unit 22 according to the inference result. As a result, the image P24 on the display screen 15a shows an image 65 in which the entire butterfly 56 is in focus, together with a frame image 66 that indicates the focus position set for the detected butterfly as the inference result obtained with the butterfly dictionary.

Suppose here that the user 51 presses the shutter button 13a without performing a focus change operation. In this case, the control unit 11 detects the shooting operation in step S37, which follows step S36. In the next step S38, the control unit 11 performs shooting and records evidence regarding the use of inference. Specifically, the captured image, signal-processed by the image processing unit 11b, is recorded in the image data recording area 16a of the recording unit 16 by the recording control unit 11c. The setting control unit 11d generates inference model usage information, which includes the fact that inference by the inference engine 12 was used and information such as a dictionary ID indicating that the dictionary used was the butterfly dictionary, and gives it to the recording control unit 11c. The recording control unit 11c then records the inference model usage information as metadata of the captured image recorded in the image data recording area 16a. The recorded captured image and inference model usage information make it clear that the shot was taken by adopting inference that used the butterfly dictionary stored in the inference engine 12.
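As a rough illustration of what such a record might look like, the sketch below attaches inference model usage information to a captured image as JSON metadata. The field names, the JSON encoding, and the functions `make_usage`, `record_capture`, and `read_usage` are assumptions made for this example; the text specifies only that the information includes whether inference was used, a dictionary ID, and (from the determination of the focus change operation) whether the inference result was adopted.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InferenceModelUsage:
    """Hypothetical layout of the inference model usage information."""
    inference_used: bool
    dictionary_id: str
    adopted: bool


def make_usage(dictionary_id: str, focus_changed_by_user: bool) -> InferenceModelUsage:
    # The inference result counts as adopted when the user shoots
    # without overriding the inferred focus position.
    return InferenceModelUsage(True, dictionary_id, not focus_changed_by_user)


def record_capture(image_data: bytes, usage: InferenceModelUsage) -> dict:
    # Store the image together with its usage information as metadata.
    metadata = json.dumps({"inference_model_usage": asdict(usage)})
    return {"image": image_data, "metadata": metadata}


def read_usage(record: dict) -> InferenceModelUsage:
    # Recover the usage information from a stored record.
    fields = json.loads(record["metadata"])["inference_model_usage"]
    return InferenceModelUsage(**fields)
```

For the case described above, `record_capture(image, make_usage("butterfly", False))` would store a record whose metadata marks the butterfly dictionary as used and adopted.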

The description above showed the user adding a dictionary explicitly, but a dictionary can also be added automatically without any user operation. If no dictionary transfer has been requested in step S41, the control unit 11 determines in the next step S45, based on the recorded inference model usage information, whether results in which the inference was unnecessary form a majority. When, for the currently set dictionary, cases in which the inference result was not adopted outnumber cases in which it was adopted, the control unit 11 determines in step S46 whether inference-unnecessary results form a majority for every dictionary held by the inference engine 12.

If the control unit 11 does not determine that inference-unnecessary results form a majority for every held dictionary, it moves the process to step S47, switches the normal dictionary to another dictionary, and returns the process to step S31. If inference-unnecessary results form a majority for every held dictionary, the control unit 11 determines that the inference engine 12 does not hold a dictionary suited to focus control for the subjects the user prefers to shoot, moves the process to step S42, and requests the external device 30 to create and transfer a dictionary. In this case, the control unit 11 may prompt the user in step S42 to specify the type of dictionary to request, that is, the target object the dictionary should detect. Alternatively, step S42 may be omitted, and the control unit 11 may specify the target object automatically and request creation and transfer of the dictionary even when the user has not specified one. For example, the control unit 11 may determine by image analysis of captured images which target object is the main subject, and request creation and transfer of a dictionary for detecting that object based on the result.
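The decision made in steps S45 to S47 can be sketched as below. This is a minimal illustration under stated assumptions: the usage history is modeled as a list of `(dictionary_id, adopted)` pairs, the names `rejected_majority` and `next_action` are hypothetical, and picking the first held dictionary without a rejection majority is an assumption, since the text says only that the dictionary is switched "to another dictionary".

```python
from typing import List, Optional, Tuple

History = List[Tuple[str, bool]]  # (dictionary_id, inference result adopted?)


def rejected_majority(history: History, dictionary_id: str) -> bool:
    """Step S45/S46 test: non-adopted results outnumber adopted ones."""
    adopted = sum(1 for d, a in history if d == dictionary_id and a)
    rejected = sum(1 for d, a in history if d == dictionary_id and not a)
    return rejected > adopted


def next_action(history: History, current: str, held: List[str]) -> Tuple[str, Optional[str]]:
    if not rejected_majority(history, current):
        return ("keep", current)          # current dictionary is still useful
    candidates = [d for d in held if d != current and not rejected_majority(history, d)]
    if candidates:
        return ("switch", candidates[0])  # step S47: switch to another dictionary
    return ("request_new", None)          # step S42: ask the external device for a new one
```

A dictionary with no recorded history is treated here as a valid candidate, which is one possible reading rather than something the text states.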

In addition, when image analysis of captured images shows that, for example, butterflies are shot frequently, the control unit 11 may, upon determining in step S36 that the focus change operation has been repeated a predetermined number of times, configure the settings so that the butterfly dictionary is given priority.

The appearance of an image varies considerably with its color tone. The control unit 11 may therefore determine the color tone of the captured image by image analysis and switch the inference model in use according to that tone. Accumulating inference model usage information on whether the inference model used in each case was adopted or not makes it easier to determine which inference model should be selected for a given color tone.
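One way to picture this accumulation is the sketch below, which tallies adoption per combination of color tone and inference model and picks the model with the highest adoption rate for a tone. Everything here is an assumption for illustration: the two-way "warm"/"cool" bucketing of the mean color, the `ToneModelStats` class, and the adoption-rate criterion are not taken from the text, which leaves the method of tone analysis and selection open.

```python
import colorsys
from collections import defaultdict


def tone_of(mean_rgb):
    """Bucket a mean (R, G, B) color (0-255 each) into a coarse tone label."""
    r, g, b = (c / 255 for c in mean_rgb)
    hue, _, _ = colorsys.rgb_to_hsv(r, g, b)
    # Hypothetical split: reds and yellows are "warm", everything else "cool".
    return "warm" if hue < 1 / 6 or hue >= 5 / 6 else "cool"


class ToneModelStats:
    def __init__(self):
        # (tone, model_id) -> [adopted count, total count]
        self._counts = defaultdict(lambda: [0, 0])

    def record(self, tone, model_id, adopted):
        entry = self._counts[(tone, model_id)]
        entry[0] += int(adopted)
        entry[1] += 1

    def best_model(self, tone, default):
        # Adoption rate of each model that has history for this tone.
        rates = {m: a / n for (t, m), (a, n) in self._counts.items() if t == tone}
        return max(rates, key=rates.get) if rates else default
```

With little history per tone the rates are noisy, so a practical version would probably require a minimum number of samples before switching; that refinement is omitted here.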

As described above, the present embodiment provides the same effects as the first embodiment. In addition, the present embodiment can not only switch the inference model in use based on a user operation but also, through determinations based on the inference model usage information, switch the inference model automatically or request a new inference model from an external device and automatically incorporate and use it. Recording the inference model usage information in this way makes it possible to determine the effectiveness and range of use of an inference model and to promote its effective use.

Guides, automatic control, and semi-automatic control that use AI will be adopted in a wide range of devices, and whatever the field of the device or equipment, an "inference function / operation relation database" of this kind is important. Whether a function is effective can be determined by considering which control of which function the inference result affects and how that affects the user, and by taking input on whether the user was satisfied with the guide, automatic control, or semi-automatic control. In other words, unless it is decided in advance which operations cancel a function that the inference has performed or is performing, there is no way to know whether the inference model suits the user. For an imaging unit in a self-driving car, the case where the user steps on the brake while the car is moving according to the inference result is also a suitable situation for verifying the effectiveness of the inference model as in the present application. If such situations occur frequently, the inference model should be customized for that user. Without a mechanism like the one in the present application, even the need for such customization cannot be judged correctly. For a camera, if the user deletes images shot using an inference model with high probability, the inference model can be judged unsuitable. This requires a mechanism for determining whether an image was shot with the inference model; in this case, the "inference function / operation relation database" need only contain a record whose inference function is automatic shooting and whose related operation is the image deletion operation. In this example, the inference model usage information needs to be recorded independently of the image concerned. Alternatively, if the inference model usage information is recorded as image metadata, then before the image is deleted, the metadata stating that the image was shot with this inference model may be output to an external analysis server together with information that the image was deleted. The server can analyze the received inference model usage information and determine whether a new inference model is needed. For a camera that uses an inference model to present a guide indicating shooting opportunities, operation of the release switch or the like is the related operation when determining cases in which no shot was taken in response to the guide. The same approach can determine that an inference model suits the user: if images shot with the inference model are consistently played back many times, the user can be judged to like them. The same applies when the images are consistently backed up somewhere. Thus, there need not be only one related operation, nor does a given inference relate only to a single specific function. When a face is detected, exposure as well as focus may be adjusted, in which case the "inference function / operation relation database" may hold two functions and several related operations. The relation between use of the inference engine and user operations may of course be determined by a specific program rather than a database, and determinations weighted by a specific formula also fall within the technical scope of the present invention.
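The "inference function / operation relation database" and the weighted determination mentioned above can be pictured with the following sketch. It is purely illustrative: the `Relation` record, the function and operation names, and the numeric weights are hypothetical, chosen only to mirror the examples in the text (image deletion counting against automatic shooting, playback and backup counting in its favor, and face detection driving both focus and exposure).

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class Relation:
    """One record: which user operation says something about which inference function."""
    inference_function: str
    related_operation: str
    weight: float  # negative = operation cancels the function, positive = confirms it


# Hypothetical contents of the relation database.
RELATIONS = [
    Relation("auto_shooting", "delete_image", -1.0),
    Relation("auto_shooting", "playback", 0.5),
    Relation("auto_shooting", "backup", 0.5),
    Relation("face_detection_focus", "manual_focus_change", -1.0),
    Relation("face_detection_exposure", "manual_exposure_change", -1.0),
]


def satisfaction_score(
    inference_function: str,
    observed_operations: Iterable[str],
    relations: Sequence[Relation] = RELATIONS,
) -> float:
    # Sum the weights of observed operations related to this function;
    # operations with no registered relation contribute nothing.
    weights = {
        r.related_operation: r.weight
        for r in relations
        if r.inference_function == inference_function
    }
    return sum(weights.get(op, 0.0) for op in observed_operations)


def model_seems_unsuitable(inference_function: str, observed_operations: Iterable[str]) -> bool:
    # Assumed rule: a negative total suggests the model does not suit the user.
    return satisfaction_score(inference_function, observed_operations) < 0
```

A simple weighted sum is only one possible "specific formula"; the text does not prescribe how the related operations should be combined.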

In the above embodiments, the image acquisition apparatus requests an external device to create and transfer an inference model, but the inference model may be created on any device; for example, a computer in the cloud may be used.

In the above embodiments, a digital camera was used as the imaging device, but the camera may be a digital single-lens reflex camera or a compact digital camera, a camera for moving images such as a video camera or movie camera, or of course a camera built into a mobile phone, a smartphone, or another portable information terminal (PDA: Personal Digital Assistant). The imaging unit may also be separate from the image acquisition apparatus; for example, a machine such as a scanner, in which the machine itself does not move and only the imaging unit moves, is also within the assumed scope. In applications such as observing microorganisms, the microscope or the stage may move. Although an endoscope was described as an example, the invention is also applicable to observation devices such as capsule endoscopes and CT scanners.

The present invention is not limited to the embodiments exactly as described; at the implementation stage, the constituent elements can be modified without departing from the gist of the invention. Various inventions can be formed by appropriately combining the constituent elements disclosed in the embodiments. For example, some of the constituent elements shown in an embodiment may be omitted. Constituent elements of different embodiments may also be combined as appropriate.

Even where the operation flows in the claims, the description, and the drawings are described using "first", "next", and the like for convenience, this does not mean that they must be carried out in that order. It also goes without saying that steps of these operation flows that do not affect the essence of the invention may be omitted as appropriate.

Of the techniques described here, the control described mainly with flowcharts can in many cases be set by a program, which may be stored in a recording medium or recording unit. The program may be recorded on the recording medium or recording unit at product shipment, supplied on a distributed recording medium, or downloaded via the Internet.

In the embodiments, a portion described as a "unit" (section or unit) may be implemented as a dedicated circuit or as a combination of several general-purpose circuits and, as needed, may be built by combining a microcomputer or a processor such as a CPU that operates according to pre-programmed software, or a sequencer such as an FPGA. A design in which an external device takes over part or all of the control is also possible, in which case a wired or wireless communication circuit is interposed. Communication may use Bluetooth, WiFi, a telephone line, or the like, or USB. Dedicated circuits, general-purpose circuits, and the control unit may be integrated into an ASIC. A moving unit and the like consist of various actuators and, where needed, a linkage mechanism for movement, with the actuators operated by a driver circuit. This driver circuit is likewise controlled by a microcomputer, ASIC, or the like according to a specific program. Such control may be corrected and adjusted in detail using information output by various sensors and their peripheral circuits. The embodiments were described in terms of judgments based on learning results obtained by artificial intelligence, using the terms inference model and learned model, but in some cases these can be replaced by a simple flowchart, conditional branching, or a quantified judgment involving computation. Machine learning may also be carried out inside the imaging apparatus, as the computing power of the camera's control circuit improves or by narrowing the scope to specific situations or target objects.

DESCRIPTION OF SYMBOLS 10 ... image acquisition apparatus, 11 ... control unit, 11a ... imaging control unit, 11b ... image processing unit, 11c ... recording control unit, 11d ... setting control unit, 11e ... communication control unit, 11f ... display control unit, 12 ... inference engine, 12a ... storage unit, 12a1, 12a2 ... dictionaries, 13 ... operation unit, 14 ... communication unit, 15 ... display unit, 16 ... recording unit, 16a ... image data recording area, 16b ... metadata recording area, 20 ... imaging apparatus, 22 ... imaging unit, 22a ... image sensor, 22b ... optical system, 30 ... external device, 31 ... learning unit, 31a ... population creation unit, 31b ... communication unit, 31c ... output setting unit, 31d ... input/output modeling unit, 32 ... external image DB, 34 ... image classification function unit.

Claims

An image acquisition apparatus comprising:
an image acquisition unit that acquires an image;
an inference unit that performs inference with a predetermined inference model, using the image acquired by the image acquisition unit as input;
a presentation unit that presents the inference result of the inference unit;
a determination unit that determines whether the inference result is adopted; and
a control unit that generates usage information regarding the use of the inference model based on the determination result of the determination unit, and records the generated usage information as metadata of the image acquired by the image acquisition unit.

The image acquisition apparatus according to claim 1, further comprising an operation unit operated by a user,
wherein the determination unit determines whether the inference result is adopted according to the operation result of the operation unit and the content of the inference result.

The image acquisition apparatus according to claim 2, wherein the determination unit determines whether the inference result is adopted according to whether the operation of the operation unit affects an item related to the presented content of the inference result.

The image acquisition apparatus according to claim 1, wherein the inference unit has a plurality of inference models, and
the control unit records, as the metadata, the usage information including information indicating which inference model was used.

The image acquisition apparatus according to claim 1, wherein, when the determination unit obtains a determination result indicating that the inference result is not adopted, the control unit automatically records the image acquired by the image acquisition unit together with the metadata.

The image acquisition apparatus according to claim 1, wherein the presentation unit issues a warning when the determination unit obtains a determination result indicating that the inference result is not adopted.

The image acquisition apparatus according to claim 6, wherein the presentation unit changes the warning method when a determination result indicating that the inference result is not adopted has been obtained by the determination unit a predetermined number of times or more.

The image acquisition apparatus according to claim 1, wherein the presentation unit includes a display unit that displays the image, and displays a display indicating the inference result on the display unit.

The image acquisition apparatus according to claim 1, wherein, when the determination unit obtains a determination result indicating that the inference result is not adopted, the control unit requests an external device to create and transfer a new inference model for use by the inference unit.

The image acquisition apparatus according to claim 1, wherein the control unit performs control to switch the inference model used by the inference unit when a determination result indicating that the inference result is not adopted has been obtained by the determination unit a number of times equal to or greater than a predetermined threshold, and requests an external device to create and transfer a new inference model for use by the inference unit when such a determination result has been obtained a number of times equal to or greater than a predetermined threshold.

An image acquisition method comprising:
an image acquisition step of acquiring an image;
an inference step of performing inference with a predetermined inference model, using the image acquired in the image acquisition step as input;
a presentation step of presenting the inference result of the inference step;
a determination step of determining whether the inference result is adopted; and
a control step of generating usage information regarding the use of the inference model based on the determination result of the determination step, and recording the generated usage information as metadata of the image acquired in the image acquisition step.

An image acquisition program that causes a computer to execute:
an image acquisition step of acquiring an image;
an inference step of performing inference with a predetermined inference model, using the image acquired in the image acquisition step as input;
a presentation step of presenting the inference result of the inference step;
a determination step of determining whether the inference result is adopted; and
a control step of generating usage information regarding the use of the inference model based on the determination result of the determination step, and recording the generated usage information as metadata of the image acquired in the image acquisition step.