JP7166951B2

JP7166951B2 - Learning request device, learning device, inference model utilization device, inference model utilization method, inference model utilization program, and imaging device

Info

Publication number: JP7166951B2
Application number: JP2019021597A
Authority: JP
Inventors: 哲也豊田; 和寛羽田; 大伊藤; 和彦長; 修野中
Original assignee: Olympus Corp
Current assignee: Olympus Corp
Priority date: 2019-02-08
Filing date: 2019-02-08
Publication date: 2022-11-08
Anticipated expiration: 2039-02-08
Also published as: JP2020129268A

Description

TECHNICAL FIELD
The present invention relates to a learning request device, a learning device, an inference model utilization device, an inference model utilization method, an inference model utilization program, and an imaging device that take consistency with usage conditions into account.

Description of the Related Art
Conventionally, devices with rule-based control, which use logic circuits to perform processing according to rules written by a person, have been employed. Inference by rule-based control can solve a variety of problems. In recent years, computer systems that solve problems using inference models generated by machine learning such as deep learning have also come into use. Devices using deep learning, in particular, can make extremely effective inferences in areas such as image analysis, speech analysis, and natural language processing.

For example, Patent Literature 1 discloses a technique for setting a threshold on the degree of abnormality that is used to judge whether an abnormality is present in abnormality detection. The device of Patent Literature 1 calculates the degree of abnormality using the result of learning from data collected from the object being monitored for abnormalities.

Patent Literature 1: JP 2018-148350 A

To obtain an effective inference model by machine learning, an appropriate learning method and appropriate training data must be adopted. The effectiveness of inference by machine learning depends on the learning, and even sufficient learning does not guarantee correct inference. In some cases, rule-based inference yields more correct results than inference by machine learning.

A mechanism that takes into account the advantages of both rule-based inference and inference by machine learning is therefore important. However, no system has yet been developed that gives sufficient consideration to matters such as how learning is performed and how the inference model is used.

An object of the present invention is to provide a learning request device, a learning device, an inference model utilization device, an inference model utilization method, an inference model utilization program, and an imaging device that take into account the advantages of conventional rule-based logic circuits and allow a device that uses inference by machine learning to function effectively.

A learning request device according to one aspect of the present invention includes: a specification setting unit that generates specification information describing the settings of a plurality of specification items for requesting the specifications of an inference model; and a control unit that performs control for transmitting the specification information, together with training data, to a learning device that generates the inference model. The specification setting unit includes in the specification information a specification item that sets whether an inference engine using the inference model performs inference in cooperation with another combined inference device.

A learning device according to one aspect of the present invention includes: an inference modeling unit that constructs an inference model based on specification information, the specification information describing the settings of a plurality of specification items for determining the specifications of the inference model and being intended for an inference engine that uses the inference model to perform inference in cooperation with another combined inference device; and a control unit that performs control for adding inference setting information, which is based on the specification information and concerns how to cooperate with the combined inference device, to inference model information for constructing the inference model, and for transmitting the result.

An inference model utilization device according to one aspect of the present invention includes: a communication unit that receives inference model information for constructing an inference model premised on an inference engine and a combined inference device performing inference in cooperation, the inference model information having added to it inference setting information on how the inference engine and the combined inference device cooperate; the inference engine, which performs inference using an inference model configured based on the inference model information; the combined inference device; and a control unit that, based on the inference setting information, causes the inference engine and the combined inference device to cooperate in executing inference.

An inference model utilization method according to one aspect of the present invention includes: receiving inference model information for constructing an inference model premised on an inference engine and a combined inference device performing inference in cooperation, the inference model information having added to it inference setting information on how the inference engine and the combined inference device cooperate; constructing the inference engine using an inference model configured based on the inference model information; and, based on the inference setting information, causing the inference engine and the combined inference device to cooperate in executing inference.

An inference model utilization program according to one aspect of the present invention causes a computer to execute a procedure of: receiving inference model information for constructing an inference model premised on an inference engine and a combined inference device performing inference in cooperation, the inference model information having added to it inference setting information on how the inference engine and the combined inference device cooperate; constructing the inference engine using an inference model configured based on the inference model information; and, based on the inference setting information, causing the inference engine and the combined inference device to cooperate in executing inference.

An imaging device according to one aspect of the present invention includes: an imaging unit that acquires a captured image of a subject; a communication unit that receives inference model information for constructing an inference model premised on an inference engine and a combined inference device performing inference in cooperation, the inference model information having added to it inference setting information on how the inference engine and the combined inference device cooperate; the inference engine, which performs inference using an inference model configured based on the inference model information; the combined inference device; and a control unit that detects a predetermined object in the captured image by causing the inference engine and the combined inference device to cooperate in executing inference based on the inference setting information.

ADVANTAGEOUS EFFECTS OF THE INVENTION
The present invention has the effect of allowing a device that uses inference by machine learning to function effectively.

FIG. 1 is a block diagram showing a learning request device, a learning device, and an imaging device according to a first embodiment of the present invention.
FIG. 2A is an explanatory diagram showing processing using a rule-based engine.
FIG. 2B is an explanatory diagram showing processing using an inference engine.
FIG. 2C is an explanatory diagram showing processing using a rule-based engine and an inference engine in cooperation.
FIG. 3 is an explanatory diagram showing an example of a specification setting menu display DM displayed on a display screen 14a of a display unit 14.
An explanatory diagram for explaining an example of how the inference engine and the combined inference device cooperate.
An explanatory diagram for explaining an example of how the inference engine and the combined inference device cooperate.
An explanatory diagram for explaining an example of how the inference engine and the combined inference device cooperate.
An explanatory diagram showing how an image is captured by the imaging device 10.
A flowchart for explaining the operation of the learning request device 30.
A flowchart for explaining the operation of the learning device 20.
An explanatory diagram for explaining learning by the learning device 20 in a first method and the inference model obtained as a result of the learning.
A flowchart for explaining the operation of the imaging device 10.
An explanatory diagram for explaining inference by the rule-based engine 18.
An explanatory diagram for explaining inference by the inference engine 17.
An explanatory diagram showing an example of image display based on the detection results of a series of inferences.
An explanatory diagram for explaining learning by the learning device 20 in a second method and the inference model obtained as a result of the learning.
A flowchart for explaining the operation of the imaging device 10.
A flowchart for explaining the operation of the imaging device 10.
A block diagram showing a second embodiment.
A block diagram showing the second embodiment.

BEST MODE FOR CARRYING OUT THE INVENTION
Embodiments of the present invention will be described in detail below with reference to the drawings.

(First embodiment)
The present invention is intended to make effective use of an inference model obtained by machine learning, and takes into account compatibility with the hardware and devices that use the inference model.
When an inference model is obtained by machine learning, the intended inference results cannot be obtained unless the differences between the training conditions (the training data supplied during learning and the performance of the training computer) and the conditions of actual use (the input data and the performance of the electronic circuit) are fully considered. A learning step that fully accounts for the constraints that apply when the inference model is used is therefore important. By devising a way to standardize these "required specifications of the inference model", so to speak, it becomes possible to draw on a highly flexible learning environment and abundant training data, and to obtain an inference model that produces more accurate output in the usage environment.

FIG. 1 is a block diagram showing a learning request device, a learning device, and an imaging device according to the first embodiment of the present invention. As one example of making effective use of an inference model obtained by machine learning, this embodiment describes the use of an inference model in a device that has a combined inference device in addition to an inference engine that uses the inference model. That is, in a device having both a rule-based engine (a combined inference device that performs inference using a rule base) and an inference engine that performs inference by machine learning, this embodiment achieves more effective inference by creating specification information for learning, generating inference model information that includes inference setting information indicating, among other things, how the learned inference model is to be used, and using that inference model information to operate the rule-based engine and the inference engine effectively. Although FIG. 1 uses an imaging device as an example of a device having a rule-based engine and an inference engine, the device is not limited to an imaging device. Likewise, although a rule-based engine is described as the combined inference device, an inference engine that performs inference by machine learning, for example one whose inference model cannot be updated, may be employed instead.
The example described here is one in which an inference engine using an inference model performs inference in cooperation with a combined inference device. It will nevertheless be clear from the following description that, even when no combined inference device exists, an inference model can be used effectively by creating specification information that describes the "required specifications of the inference model" and by generating inference model information that includes inference setting information indicating, among other things, how to use the inference model created from that specification information.

First, an example in which the rule-based engine and the inference engine are used in cooperation will be described with reference to FIGS. 2A to 2C. FIG. 2A is an explanatory diagram showing processing using the rule-based engine, FIG. 2B is an explanatory diagram showing processing using the inference engine, and FIG. 2C is an explanatory diagram showing processing using the rule-based engine and the inference engine in cooperation. FIGS. 2A to 2C show flows for obtaining the white balance gain (WB gain) of the imaging device. In FIGS. 2A to 2C, processing by the inference engine is indicated by hatching.

In the example of FIG. 2A, light source color determination and subject color determination are performed on the input image. The rule-based engine uses these determination results to make an overall determination according to preset rules and obtains the white balance gain. In the example of FIG. 2B, the inference engine estimates the white balance gain directly from the input image (WB estimation).

In the example of FIG. 2C, the inference engine does not output a white balance gain directly for the input image; instead, it first obtains a light source map from the input image. Meanwhile, the rule-based engine determines the light source color. The inference engine then corrects the light source color determined by the rule-based engine using the light source map. The white balance gain is obtained by an overall determination based on the light source color corrected by the inference engine and the result of the subject color determination.
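The cooperative flow of FIG. 2C can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the gray-world rule, the brightness-based light source map (a stand-in for a trained network), the 50/50 blend, and the damping for strongly colored subjects are all hypothetical and do not come from the patent.

```python
from typing import List, Tuple

RGB = Tuple[float, float, float]
Image = List[List[RGB]]


def mean_color(image: Image) -> RGB:
    """Average colour over all pixels."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    return (sum(p[0] for p in pixels) / n,
            sum(p[1] for p in pixels) / n,
            sum(p[2] for p in pixels) / n)


def rule_based_light_source_color(image: Image) -> RGB:
    # Hypothetical rule: assume the scene averages to the light source colour.
    return mean_color(image)


def infer_light_source_map(image: Image) -> List[List[float]]:
    # Stand-in for the inference engine; a trained network would go here.
    # Brighter pixels are treated as more likely to show the light source.
    return [[sum(p) / 3.0 for p in row] for row in image]


def correct_light_source_color(rule_color: RGB, image: Image,
                               light_map: List[List[float]]) -> RGB:
    # Blend the rule-based estimate with a map-weighted colour (assumed 50/50).
    total = sum(w for row in light_map for w in row)
    if total == 0.0:
        return rule_color
    weighted = [0.0, 0.0, 0.0]
    for row, weights in zip(image, light_map):
        for pixel, w in zip(row, weights):
            for c in range(3):
                weighted[c] += w * pixel[c]
    return (0.5 * rule_color[0] + 0.5 * weighted[0] / total,
            0.5 * rule_color[1] + 0.5 * weighted[1] / total,
            0.5 * rule_color[2] + 0.5 * weighted[2] / total)


def subject_is_strongly_colored(image: Image) -> bool:
    # Hypothetical subject colour determination.
    m = mean_color(image)
    return max(m) - min(m) > 0.5


def overall_determination(light: RGB, strongly_colored: bool) -> RGB:
    # Gains that map the light source colour to neutral, normalised to green.
    eps = 1e-6
    r_gain = light[1] / max(light[0], eps)
    b_gain = light[1] / max(light[2], eps)
    if strongly_colored:
        # Assumed rule: correct less when the subject itself is colourful.
        r_gain = 1.0 + 0.5 * (r_gain - 1.0)
        b_gain = 1.0 + 0.5 * (b_gain - 1.0)
    return (r_gain, 1.0, b_gain)


def wb_gain_cooperative(image: Image) -> RGB:
    rule_color = rule_based_light_source_color(image)   # rule-based engine
    light_map = infer_light_source_map(image)           # inference engine
    corrected = correct_light_source_color(rule_color, image, light_map)
    return overall_determination(corrected, subject_is_strongly_colored(image))
```

For a uniformly gray image this sketch returns gains of 1 for every channel, while a warm-tinted image yields a red gain below 1 and a blue gain above 1.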

The imaging device 10 of FIG. 1 includes an inference engine 17 and a rule-based engine 18 that perform inference in cooperation in this way. The imaging device 10 records images obtained by imaging a subject. The imaging device 10 may be a digital camera or video camera, or a camera built into a smartphone or tablet terminal. If the inference engine covers the functions of the rule-based engine, the rule-based engine may be omitted.

The imaging device 10 includes a control unit 11 that controls each part of the imaging device 10. The control unit 11 may be configured by a processor using a CPU (Central Processing Unit) or the like that operates according to a program stored in a memory (not shown) to control each part, or some or all of its functions may be implemented by hardware electronic circuits.

The imaging unit 12 of the imaging device 10 has an imaging element 12a and an optical system 12b. The optical system 12b includes lenses for zooming and focusing, a diaphragm, and the like (not shown). The optical system 12b also includes a zoom (variable magnification) mechanism, a focus mechanism, and a diaphragm mechanism (not shown) that drive these lenses.

The imaging element 12a is a CCD, CMOS sensor, or the like, and the optical system 12b guides an optical image of the subject onto the imaging surface of the imaging element 12a. The imaging element 12a photoelectrically converts the optical image of the subject to acquire a captured image (imaging signal) of the subject.

The imaging control unit 11a of the control unit 11 drives and controls the zoom mechanism, focus mechanism, and diaphragm mechanism of the optical system 12b to adjust zoom, aperture, and focus. The imaging unit 12 performs imaging under the control of the imaging control unit 11a and outputs the imaging signal of the captured images (moving images and still images) to the control unit 11.

The imaging device 10 is provided with an operation unit 13. The operation unit 13 includes a release button, function buttons, various switches for shooting mode setting, parameter operation, and the like, dials, ring members, and so on (not shown), and outputs operation signals based on user operations to the control unit 11. The control unit 11 controls each part based on the operation signals from the operation unit 13.

The control unit 11 takes in captured images (moving images and still images) from the imaging unit 12. The image processing unit 11b of the control unit 11 performs predetermined signal processing on the captured images, such as color adjustment, matrix conversion, noise removal, and various other signal processing.

The imaging device 10 is provided with a display unit 14, and the control unit 11 is provided with a display control unit 11f. The display unit 14 is a display device having a display screen such as an LCD (liquid crystal display), and the display screen is provided, for example, on the back of the housing of the imaging device 10. The display control unit 11f causes the display unit 14 to display the captured image processed by the image processing unit 11b. The display control unit 11f can also cause the display unit 14 to display various menus, warnings, and the like of the imaging device 10.

The imaging device 10 is provided with a communication unit 15, and the control unit 11 is provided with a communication control unit 11e. Under the control of the communication control unit 11e, the communication unit 15 can exchange information with the learning device 20 and the learning request device 30. The communication unit 15 is capable of, for example, short-range wireless communication such as Bluetooth (registered trademark) and wireless LAN communication such as Wi-Fi (registered trademark). The communication unit 15 is not limited to Bluetooth and Wi-Fi and may employ various other communication methods. The communication control unit 11e can receive inference model information from the learning device 20 via the communication unit 15. This inference model information is used to construct a desired inference model with the network 17a of the inference engine 17.

The control unit 11 is provided with a recording control unit 11c. The recording control unit 11c can compress the captured image after signal processing and supply the compressed image to the recording unit 16 for recording. The recording unit 16 is configured by a predetermined recording medium; it records information supplied from the control unit 11 and can output recorded information to the control unit 11. A card interface, for example, may be used as the recording unit 16, in which case the recording unit 16 can record image data on a recording medium such as a memory card.

The recording unit 16 has an image data recording area 16a, and the recording control unit 11c records image data in the image data recording area 16a. The recording control unit 11c can also read out and reproduce information recorded in the recording unit 16.

The recording unit 16 also has an inference setting recording area 16b. The recording control unit 11c can record received inference model information in the inference setting recording area 16b of the recording unit 16. In this way, inference model information concerning the settings of the network 17a that constitutes the inference engine 17 is recorded in the inference setting recording area 16b. As described above, the inference model information also includes the inference setting information. In addition, test data for verifying the operation of the inference engine 17 is recorded in a test data recording area 16c of the recording unit 16.

The inference setting information is generated based on specification information, described later, and defines how the inference engine 17 and the rule-based engine 18, which is the combined inference device, cooperate to obtain a desired inference result. The specification information may be used as the inference setting information as it is, or the inference setting information may be generated from the specification information. The inference setting information indicates, for example, the input/output relationship between the inference engine 17 and the rule-based engine 18.
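One way to picture inference setting information that records the input/output relationship between the two engines is the following sketch. The patent does not define a concrete format, so the class names, field names, and the signal names used here are hypothetical; the ordering check is likewise an assumed convenience, not something the source describes.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Sequence


@dataclass
class EngineStep:
    engine: str         # hypothetical label: "inference" or "rule_based"
    inputs: List[str]   # named signals this engine consumes, e.g. "image"
    output: str         # named signal this engine produces


@dataclass
class InferenceSetting:
    steps: List[EngineStep]
    final_output: str   # signal handed to the overall determination


def execution_order(setting: InferenceSetting,
                    sources: Sequence[str] = ("image",)) -> List[str]:
    """Return engine labels in an order where every input is already available."""
    available = set(sources)
    pending = list(setting.steps)
    order: List[str] = []
    while pending:
        ready = [s for s in pending if all(i in available for i in s.inputs)]
        if not ready:
            raise ValueError("some engine inputs are never produced")
        for step in ready:
            available.add(step.output)
            order.append(step.engine)
            pending.remove(step)
    if setting.final_output not in available:
        raise ValueError("final output is never produced")
    return order


# A FIG. 2C style link: the inference engine consumes the rule-based result.
fig2c_like = InferenceSetting(
    steps=[
        EngineStep("inference", ["image", "light_source_color"],
                   "corrected_light_source_color"),
        EngineStep("rule_based", ["image"], "light_source_color"),
    ],
    final_output="corrected_light_source_color",
)

# The setting can be serialised, e.g. to travel with inference model information.
serialized = json.dumps(asdict(fig2c_like))
```

Here `execution_order(fig2c_like)` places the rule-based engine before the inference engine even though the steps are listed in the opposite order, because the inference engine depends on the rule-based output.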

The rule-based engine 18 includes a logic circuit 18a. The logic circuit 18a performs inference according to preset rules and obtains an output by inference on its input. In the example of FIG. 1, the logic circuit 18a receives the captured image from the imaging unit 12 and outputs an inference result. In this embodiment, the inference result from the rule-based engine 18 is supplied to the control unit 11 and may also be supplied to the inference engine 17.

The inference engine 17 has a network 17a. The network 17a is constructed using the setting values included in the inference model information recorded in the inference setting recording area 16b, and constitutes the network obtained when learning in machine learning is complete, that is, the inference model. In the example of FIG. 1, the network 17a receives the captured image from the imaging unit 12 and outputs an inference result. In this embodiment, the inference result from the inference engine 17 is supplied to the control unit 11 and may also be supplied to the rule-based engine 18.

That is, in this embodiment, the inference engine 17 may perform inference on not only the captured image from the imaging unit 12 but also the inference result of the rule-based engine 18. Similarly, the rule-based engine 18 may perform inference on not only the captured image from the imaging unit 12 but also the inference result of the inference engine 17.

This cooperative relationship between the inference engine 17 and the rule-based engine 18 is described in the inference setting information within the inference model information. The control unit 11 is provided with an inference setting unit 11d, which reads the inference setting information from the inference setting recording area 16b and controls the inference engine 17 and the rule-based engine 18. The inference setting unit 11d makes an overall determination using the inference result of at least one of the inference engine 17 and the rule-based engine 18.
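The role of the inference setting unit 11d, reading the setting and driving both engines accordingly, might be sketched as below. The dictionary layout, the engine labels, and the two toy engines (simple thresholds standing in for the logic circuit 18a and the network 17a) are assumptions made for illustration only.

```python
from typing import Any, Callable, Dict, List

Signals = Dict[str, Any]
Engine = Callable[[Signals], Any]


def run_linked_inference(setting: Dict[str, Any],
                         engines: Dict[str, Engine],
                         image: Any) -> Any:
    """Run the engines in the listed order, passing named signals between them."""
    signals: Signals = {"image": image}
    for step in setting["steps"]:
        inputs = {name: signals[name] for name in step["inputs"]}
        signals[step["output"]] = engines[step["engine"]](inputs)
    # The overall determination here simply returns the designated signal.
    return signals[setting["final_output"]]


def rule_based_engine(inputs: Signals) -> List[int]:
    # Toy rule: every sample brighter than 0.5 is a candidate.
    return [i for i, v in enumerate(inputs["image"]) if v > 0.5]


def inference_engine(inputs: Signals) -> List[int]:
    # Toy stand-in for a trained model: keep only confident candidates.
    return [i for i in inputs["candidates"] if inputs["image"][i] > 0.8]


ENGINES: Dict[str, Engine] = {
    "rule_based": rule_based_engine,
    "inference": inference_engine,
}

# The rule-based result feeds the inference engine (steps assumed pre-ordered).
LINKED_SETTING = {
    "steps": [
        {"engine": "rule_based", "inputs": ["image"], "output": "candidates"},
        {"engine": "inference", "inputs": ["image", "candidates"],
         "output": "detections"},
    ],
    "final_output": "detections",
}

# Only one engine contributes to the overall determination.
RULE_ONLY_SETTING = {
    "steps": [
        {"engine": "rule_based", "inputs": ["image"], "output": "candidates"},
    ],
    "final_output": "candidates",
}
```

Swapping the setting changes how the engines cooperate without changing either engine, which is the point of keeping the cooperation rules in data rather than in code.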

In this embodiment, the specification information on which the inference setting information is based is generated by a specification setting unit 31a of the learning request device 30, described later. In addition, the control unit 11 is provided with a specification setting unit 11g that has the same function as the specification setting unit 31a. The specification setting units 31a and 11g can generate specification information describing the specifications of the inference model information needed to generate the inference model adopted by the inference engine 17. The specification setting unit 11g transmits the generated specification information to the learning request device 30 via the communication unit 15.

The specification setting units 31a and 11g generate part of the specification information automatically and part of it based on the user's input operations. The specification setting unit 11g can control the display control unit 11f to display a specification setting menu for checking and changing the specification information. The specification setting unit 31a can likewise display a specification setting menu for checking and changing the specification information on a display unit (not shown) provided in the learning request device 30.
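As a rough sketch of specification information that is partly generated automatically and partly taken from user input, consider the following. Only the item that records whether the inference engine cooperates with a combined inference device comes from the text; every other key name, the required-item list, and the JSON request layout are hypothetical.

```python
import json
from typing import Any, Dict, List

# Hypothetical set of items a request must carry.
REQUIRED_ITEMS = (
    "target",
    "input_width",
    "input_height",
    "cooperate_with_combined_device",
)


def build_specification(device_info: Dict[str, Any],
                        user_input: Dict[str, Any]) -> Dict[str, Any]:
    """Merge automatically derived items with items entered by the user."""
    spec: Dict[str, Any] = {
        # Items the device can fill in by itself.
        "input_width": device_info["image_width"],
        "input_height": device_info["image_height"],
    }
    # User entries are added last so they can override automatic values.
    spec.update(user_input)
    missing = [key for key in REQUIRED_ITEMS if key not in spec]
    if missing:
        raise ValueError("missing specification items: " + ", ".join(missing))
    if not isinstance(spec["cooperate_with_combined_device"], bool):
        raise ValueError("cooperate_with_combined_device must be True or False")
    return spec


def build_learning_request(spec: Dict[str, Any],
                           training_data_ids: List[str]) -> str:
    """Bundle the specification with references to training data."""
    return json.dumps(
        {"specification": spec, "training_data": training_data_ids},
        sort_keys=True,
    )
```

A call such as `build_specification({"image_width": 640, "image_height": 480}, {"target": "bird", "cooperate_with_combined_device": True})` yields a specification that can then be passed to `build_learning_request` together with training data identifiers; the values shown are illustrative only.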

When an inference model is obtained by machine learning, the intended inference results cannot be obtained unless sufficient account is taken of the differences between the teacher data and the performance of the training computer at learning time, on the one hand, and the input data and the performance of the electronic circuitry in actual use, on the other. A learning step that fully considers the constraints that apply when the inference model is used is therefore important. The kinds of information shown in FIG. 3 make it possible to standardize what may be called the "required specifications of the inference model." Various organizations and individuals are making learning more efficient and more capable by various means, but without such a basis it is difficult to mount a correctly functioning inference model on an actual device and have it perform as intended. In other words, clearly setting such required specifications is important in order to exploit a highly flexible learning environment and abundant teacher data and thereby obtain an inference model capable of more accurate output in the environment of use. These items may be set by manual input, from information recorded in a recording unit mounted on the device, or from information obtained from outside by communication. Thus, careful matching is important for making effective use of an inference model obtained by machine learning, and FIG. 3 is an example of a list drawn up with consistency with the hardware and device that use the inference model in mind. Accordingly, some of the items in the required specifications shown here may be unnecessary in some cases, and in other situations items should be added, such as environmental constraints (temperature, humidity, atmospheric pressure) or the power supply voltage and clock specifications of the circuit. Conditions under which the inference model is not to be used may also be described as necessary.

Thus, the present embodiment can provide an inference model utilization device that has a communication unit for requesting an external party to create an inference model and that can display, on a display unit, a list of the required specifications of the inference model to be installed. This inference model utilization device serves as a high-accuracy inference device that can request from an external device an inference model well matched to the device itself.

FIG. 3 is an explanatory diagram showing an example of the specification setting menu display DM shown on the display screen 14a of the display unit 14. In the example of FIG. 3, the specification information includes 15 specification items, but the specification items can be set as appropriate. The example of FIG. 3 shows the specification items of specification information for creating an inference model that detects a predetermined detection target in an image.

The detection target item is information indicating that detection target information, which identifies the detection target, is attached to the teacher data as an annotation. The hardware information item gives information about the device that uses the inference model and specifies the device name, the number of network layers, the clock frequency, and the memory capacity. The response time item specifies the time from image input to inference output. In the example of FIG. 3, the accuracy rate and reliability item specifies that an inference reliability of 90% or more is to be ensured, and the input image size and other item specifies that the input image size is FHD (full high definition) and that process A is used as the image processing.

The input image system number item specifies that the input image is switched between two systems. With this specification, the inference engine 17 can receive both the image from the imaging unit 12 and the image from the rule base engine 18. In other words, this specification makes it possible to supply the output of the rule base engine 18 to the inference engine 17 so that the two engines can cooperate.

In the present embodiment, a combined detector and auxiliary information item is also set. This item specifies the combined inference device to be used together with the inference engine. In the example of FIG. 1, the rule base engine 18, which performs inference in cooperation with the inference engine 17, is specified as the combined inference device. The example of FIG. 3 shows that the rule base engine 18 is a device having a function of performing face detection in accordance with specification **. The auxiliary information item specifies, as auxiliary information, which output of the rule base engine 18 is to be used when that output is supplied to the inference engine 17. For example, for face detection, the coordinates of the face detected by the rule base engine 18, an image of the face region, and the like can be specified as auxiliary information to be input to the inference engine 17. That is, this specification defines the input from the rule base engine 18 to the inference engine 17.

The delivery date item specifies when the inference model information is to be delivered. The teacher data and target item specifies the storage folder of the teacher data and the detection target (file); the example of FIG. 3 indicates folder X and detection target Y. The other data use item specifies the use of data other than images when the input is an image. For example, it allows the tilt angle of the image to be used as additional data when detecting a building in the image. The test sample item specifies sample data for testing the inference model; in the example of FIG. 3, no test sample is provided. The activation condition and restriction item specifies the conditions under which inference is activated and any restrictions. For example, it can specify that inference is not executed when, at the time of image acquisition, the luminance of the image is at or below a predetermined value.

The supplementary information item specifies supplementary information; for example, it can specify that a frame or text be displayed on the detected image portion. The history information item specifies the history of previous inference models, for example by version information.

In the present embodiment, an item priority item is set. The item priority item specifies which specification items in the specification setting information should be given priority when generating the inference model. The example of FIG. 3 indicates that priority is set in the following order: the detection target item; the hardware information item; the accuracy rate and reliability item; the response time item; the input image size and other item; and the input image system number item.
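As a rough illustration, the specification items described above could be held in a record such as the following. This is only a sketch: the field names, the `prioritized_items` helper, and the values marked hypothetical (device name, layer count, clock, memory, response time) are assumptions made for the example and do not appear in the publication, which describes the items only in prose.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class HardwareInfo:
    device_name: str       # device that will run the inference model
    network_layers: int    # number of network layers
    clock_mhz: float       # clock frequency
    memory_mb: int         # memory capacity


@dataclass
class SpecificationInfo:
    detection_target: str              # annotated target in the teacher data
    hardware: HardwareInfo
    response_time_ms: float            # image input to inference output
    min_reliability: float             # 0.90 means "90% or more"
    input_image_size: str              # e.g. "FHD"
    image_processing: str              # e.g. "process A"
    input_image_systems: int           # 2 = imaging unit + rule base engine
    combined_detector: Optional[str]   # combined inference device, if any
    auxiliary_info: List[str] = field(default_factory=list)
    teacher_data_folder: str = ""
    test_sample_provided: bool = False
    item_priority: List[str] = field(default_factory=list)

    def prioritized_items(self) -> List[str]:
        """Return the priority list, dropping names that are not fields."""
        known = {f.name for f in fields(self)}
        return [name for name in self.item_priority if name in known]


spec = SpecificationInfo(
    detection_target="Y",
    hardware=HardwareInfo("camera-x", 9, 200.0, 64),  # hypothetical values
    response_time_ms=50.0,                            # hypothetical value
    min_reliability=0.90,
    input_image_size="FHD",
    image_processing="process A",
    input_image_systems=2,
    combined_detector="rule-based face detection",
    auxiliary_info=["face_coordinates", "face_image"],
    teacher_data_folder="X",
    item_priority=[
        "detection_target", "hardware", "min_reliability",
        "response_time_ms", "input_image_size", "input_image_systems",
    ],
)
```

Keeping the priority order as a plain list of field names lets a learning device walk the items in order when it has to decide which requirements to relax last.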

The specification setting units 31a and 11g automatically generate predetermined specification items among these and generate the other specification items from user input. For example, the specification setting units 31a and 11g may automatically generate the predetermined specification items based on the specifications of the imaging device 10. They may do so by executing a specification information generation program stored in a memory (not shown), or by using an inference engine (not shown) for generating specification information.

The specification setting unit 11g transmits the generated specification information to the learning request device 30. The learning request device 30 requests the learning device 20 to perform learning based on the specification information, which is either the specification information generated by the specification setting unit 31a or that generated by the specification setting unit 11g.

The learning request device 30 has a communication unit 32, and the learning device 20 has a communication unit 22. The communication units 22 and 32 have the same configuration as the communication unit 15, and the communication units 22, 32, and 15 can communicate with one another.

The learning request device 30 has a control unit 31 that controls each part of the learning request device 30, and the learning device 20 has a control unit 21 that controls each part of the learning device 20. The control units 21 and 31 may be configured by a processor using a CPU, an FPGA, or the like, may operate according to a program stored in a memory (not shown) to control each part, or may implement some or all of their functions in hardware electronic circuits.

Note that the learning unit 20 as a whole may be configured by a processor using a CPU, a GPU, an FPGA, or the like and may operate according to a program stored in a memory (not shown) to control learning, or may implement some or all of its functions in hardware electronic circuits.

The learning request device 30 has an image classification recording unit 33 in which a large amount of learning data is recorded. The image classification recording unit 33 is configured by a recording medium (not shown) such as a hard disk or a memory medium, and records a plurality of images classified by the type of object contained in each image. In the example of FIG. 9, the image classification recording unit 33 stores an object image group 34, which includes teacher data 34a and test data 34b for each type of object. The control unit 31 has the specification setting unit 31a, which has the same function as the specification setting unit 11g. The control unit 31 of the learning request device 30 controls the communication unit 32 to transmit the specification information and the teacher data 34a to the learning device 20.

The test data 34b is similar to the teacher data 34a but is not provided to the learning device 20; it is used in the learning request device 30 and the imaging device 10 to test the inference model obtained as a result of learning by the learning device 20. The learning request device 30 can transmit the test data 34b to the imaging device 10. The imaging device 10 records the test data from the learning request device 30 in the test data recording area 16c.

The mother set creation unit 24 of the learning unit 20 records the teacher data transmitted from the learning request device 30 in the teacher data recording unit 23. The mother set creation unit 24 has an input data setting unit 24a and an output item setting unit 24b. The input data setting unit 24a sets the input data used for learning, and the output item setting unit 24b sets the output that should be obtained as a result of inference. The settings of the input data setting unit 24a and the output item setting unit 24b are made based on the specification information received from the learning request device 30.

The input/output modeling unit 25 determines a network design such that the expected output is obtained from a large amount of teacher data, and generates inference model information, which is the setting information of that design. The input/output modeling unit 25 is provided with a specification matching unit 25a. The specification matching unit 25a has a memory (not shown) that stores the specification information, and determines whether the inference model information obtained by the input/output modeling unit 25 corresponds to the specification information. The input/output modeling unit 25 continues constructing the network design until the inference model information corresponds to the specification information.
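The interplay between network construction and specification matching can be pictured as a simple loop. The sketch below is illustrative only: the dictionary keys, the `toy_propose` stand-in for training, and the `max_attempts` budget are assumptions for the example, not details taken from the publication.

```python
from typing import Callable, Dict, Optional

Spec = Dict[str, float]
ModelInfo = Dict[str, float]


def matches_spec(model: ModelInfo, spec: Spec) -> bool:
    """Return True when the candidate satisfies every limit in the spec."""
    return (
        model["layers"] <= spec["max_layers"]
        and model["response_time_ms"] <= spec["max_response_time_ms"]
        and model["reliability"] >= spec["min_reliability"]
    )


def build_until_match(
    propose: Callable[[int], ModelInfo], spec: Spec, max_attempts: int = 10
) -> Optional[ModelInfo]:
    """Try successive network designs until one corresponds to the spec."""
    for attempt in range(max_attempts):
        candidate = propose(attempt)
        if matches_spec(candidate, spec):
            return candidate
    return None  # no design met the spec within the attempt budget


def toy_propose(attempt: int) -> ModelInfo:
    # Stand-in for training: each attempt is assumed to add layers,
    # which raises both reliability and response time.
    layers = 3 + 2 * attempt
    return {
        "layers": layers,
        "response_time_ms": 5.0 * layers,
        "reliability": min(0.99, 0.70 + 0.05 * layers),
    }
```

In this toy setup, a specification of at most 9 layers, at most 50 ms, and at least 90% reliability is first met by the second proposal, while a specification demanding 100% reliability is never met and the loop gives up.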

In the present embodiment, the input/output modeling unit 25 generates inference setting information and adds it to the inference model information. The inference setting information includes information on whether the constructed inference model performs inference in cooperation with a combined inference device and, if it does, information on how the cooperation is carried out. For example, the input/output modeling unit 25 may add the inference setting information to the header information of the inference model information. For example, the input/output modeling unit 25 generates, as the header information of the inference model information, information that includes information indicating the network structure, weight and bias information, information on the cooperating combined inference device, and information on how to cooperate with the combined inference device.

Note that the input/output modeling unit 25 may add the inference setting information to the inference model information as meta information rather than as header information, or may associate inference setting information held as data separate from the inference model information with the inference model information. The input/output modeling unit 25 may also add the specification information as is to the inference model information as the inference setting information.
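One conceivable way to carry inference setting information in a header is sketched below. The byte layout (a length-prefixed JSON header followed by float32 values), the key names, and the function names are all assumptions for illustration; the publication does not define a file format. For simplicity the weights are placed in a payload after the header, whereas the text lists weight and bias information among the header contents.

```python
import json
import struct
from typing import Any, Dict, List, Tuple


def pack_model_info(header: Dict[str, Any], weights: List[float]) -> bytes:
    """Pack a JSON header and float32 weights into one byte string."""
    header_bytes = json.dumps(header, sort_keys=True).encode("utf-8")
    body = struct.pack(f"<{len(weights)}f", *weights)
    # 4-byte little-endian header length, then header, then weight payload.
    return struct.pack("<I", len(header_bytes)) + header_bytes + body


def unpack_model_info(blob: bytes) -> Tuple[Dict[str, Any], List[float]]:
    """Recover the header and the weights from a packed byte string."""
    (header_len,) = struct.unpack_from("<I", blob, 0)
    header = json.loads(blob[4:4 + header_len].decode("utf-8"))
    body = blob[4 + header_len:]
    weights = list(struct.unpack(f"<{len(body) // 4}f", body))
    return header, weights


header = {
    "network": {"type": "feedforward", "layers": [4, 3, 2]},  # hypothetical
    "inference_setting": {
        "combined_device": "rule_based_face_detector",
        "cooperation": "third_method",
        "auxiliary_info": ["face_coordinates", "face_image"],
    },
}
blob = pack_model_info(header, [0.5, -1.0, 0.25])
restored_header, restored_weights = unpack_model_info(blob)
```

A device receiving such a blob could read only the header to decide how to configure the two engines before loading any weights.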

The input/output modeling unit 25 transmits the inference model information, including the generated inference setting information, to the control unit 11 of the imaging device 10 via the communication unit 22. As described above, the control unit 11 records the inference model information in the inference setting recording area 16b. Based on the inference setting information, the inference setting unit 11d controls the operation of the inference engine 17 and the rule base engine 18, making it possible to obtain an inference result that uses the inferences of the inference engine 17 and the rule base engine 18.

Next, the operation of the embodiment configured in this way will be described with reference to FIGS. 4 to 17. FIGS. 4 to 6 are explanatory diagrams for explaining examples of how the inference engine and the combined inference device cooperate.

Three methods, shown in FIGS. 4 to 6, will be described as ways in which the inference engine 17 and the rule base engine 18 cooperate to perform inference. In the examples of FIGS. 4 to 6, inference is performed to detect a predetermined detection target in an input image.

The example of FIG. 4 shows a method (hereinafter, the first method) in which inference by the inference engine 17 and inference by the rule base engine 18, the combined inference device, are executed independently of each other. The input image is supplied to both the inference engine 17 and the rule base engine 18, and each performs inference independently. The inference engine 17 outputs, as its inference result, position information of the detection target on the image together with its reliability, and the rule base engine 18 likewise outputs position information of the detection target on the image together with its reliability. The control unit 11 selects position information based on the reliability information and determines the detection target.

FIG. 5 shows a method (hereinafter, the second method) in which inference by the inference engine 17 is given priority while the inference result of the inference engine 17 is supplied to the rule base engine 18, the combined inference device, to improve inference accuracy. The input image is first supplied to the inference engine 17. The inference engine 17 outputs its inference result to the control unit 11 and, for example when an inference result of sufficient reliability cannot be obtained, outputs the inference result or a determination that inference is not possible to the rule base engine 18. On receiving this, the rule base engine 18 performs inference and outputs its inference result to the control unit 11. The control unit 11 selects position information based on the reliability information and determines the detection target.

FIG. 6 shows a method (hereinafter, the third method) in which inference by the inference engine 17 is given priority while the inference result of the rule base engine 18, the combined inference device, is supplied to the inference engine 17 to improve the accuracy of inference by the inference engine 17. The input image is supplied to both the inference engine 17 and the rule base engine 18. The rule base engine 18 outputs its inference result to the control unit 11 and also outputs to the inference engine 17, for example, auxiliary information for improving the inference accuracy of the inference engine 17. For example, the rule base engine 18 detects the face of the detection target and outputs the coordinates of the face, the face image, and the like to the inference engine 17 as auxiliary information. On receiving this, the inference engine 17 performs inference and outputs its inference result to the control unit 11. The control unit 11 selects position information based on the reliability information and determines the detection target. Note that the manner of cooperation is not limited to the first to third methods. For example, the inference result of the combined inference device may be used preferentially.
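The three cooperation methods can be compared side by side in a short sketch. Everything here is illustrative: the `Detection` record, the engine call signature, the 0.9 reliability threshold in the second method, and the two toy engines are assumptions made for the example rather than details from the publication.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height


@dataclass
class Detection:
    box: Optional[Box]   # None when nothing was detected
    reliability: float   # 0.0 to 1.0


# An engine takes an image and an optional hint and returns a detection.
Engine = Callable[[object, Optional[Detection]], Detection]


def pick(a: Detection, b: Detection) -> Detection:
    """Control-unit step: keep the result with the higher reliability."""
    return a if a.reliability >= b.reliability else b


def first_method(image, inference: Engine, rule_based: Engine) -> Detection:
    # Both engines run independently on the same input image.
    return pick(inference(image, None), rule_based(image, None))


def second_method(image, inference: Engine, rule_based: Engine,
                  min_reliability: float = 0.9) -> Detection:
    # The inference engine runs first; its result goes to the rule base
    # engine only when the reliability is insufficient.
    primary = inference(image, None)
    if primary.reliability >= min_reliability:
        return primary
    return pick(primary, rule_based(image, primary))


def third_method(image, inference: Engine, rule_based: Engine) -> Detection:
    # The rule base result is passed to the inference engine as
    # auxiliary information.
    hint = rule_based(image, None)
    return pick(inference(image, hint), hint)


def toy_inference(image, hint):
    # Hypothetical stand-in: more reliable when a hint is supplied.
    return Detection((10, 10, 40, 40), 0.95 if hint is not None else 0.6)


def toy_rule_based(image, hint):
    # Hypothetical stand-in with a fixed reliability.
    return Detection((12, 11, 38, 39), 0.8)
```

With these toy engines, the first and second methods end up selecting the rule base result (reliability 0.8), while the third method selects the inference engine result that benefited from the hint (reliability 0.95).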

Next, specific examples of the first to third methods will be described.

FIG. 7 is an explanatory diagram showing how an image is captured with the imaging device 10. In the imaging device 10 of the present embodiment, the circuits shown in FIG. 1 are housed in a housing 10a shown in FIG. 7, and the display screen 14a of the display unit 14 is provided on the back of the housing 10a. A user 41, for example, holds the housing 10a with the right hand 42 and shoots with the subject captured in the field of view while looking at the display screen 14a of the display unit 14. A release switch 43, which is part of the operation unit 13, is provided on the top surface of the housing 10a. In the example of FIG. 7, the subject is a cat 46 on top of a block wall 45. The user 41 shoots by pressing the release switch 43 with the index finger 42a of the right hand 42.

In the example of FIG. 7, the inference engine 17 and the rule base engine 18 of the imaging device 10 are assumed to be configured to detect the cat 46 in the captured image by inference. The rule base engine 18 is configured, for example, to be able to detect a cat's face by a known method.

(First method)
FIG. 8 is a flowchart for explaining the operation of the learning request device 30, and FIG. 9 is a flowchart for explaining the operation of the learning device 20.

In step S1 of FIG. 8, the control unit 31 of the learning request device 30 determines whether to set the required specifications of the inference model to be set in the inference engine 17. If the required specifications are not to be set, the control unit 31 determines in step S2 whether creation of a teacher data folder has been instructed and, if so, creates the teacher data folder in step S3. That is, the control unit 31 collects images to serve as teacher data for cat detection inference and annotates them. The collected teacher data is recorded as teacher data 34a in the object image group 34 of the image classification recording unit 33. Test data 34b is also recorded in the object image group 34.

When the required specifications are to be set, the control unit 31 proceeds from step S1 to step S4, where the specification setting unit 31a sets the specification items shown in FIG. 3. As described above, the specification setting unit 31a generates some of the specification items automatically and creates the others based on the user's input operations. In step S5, the control unit 31 determines whether input of all the specification items has been completed. If not, it returns the process to step S1; if so, it transmits the specification information made up of the set specification items, together with the teacher data, to the learning device 20 and requests learning.
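The completeness check of step S5 can be sketched as follows. This is a minimal illustration under stated assumptions: the item names in `REQUIRED_ITEMS` are a shortened, hypothetical subset of the FIG. 3 items, the rule that user input overrides auto-generated values is an assumption, and the request is modeled as a plain dictionary.

```python
from typing import Dict, Iterable, List, Optional

# Shortened, hypothetical subset of the FIG. 3 specification items.
REQUIRED_ITEMS: List[str] = [
    "detection_target",
    "hardware",
    "response_time",
    "reliability",
    "input_image_size",
    "teacher_data_folder",
]


def merge_items(auto: Dict[str, str], user: Dict[str, str]) -> Dict[str, str]:
    """Combine auto-generated items with user-entered ones.

    Letting user input override auto-generated values is an assumption.
    """
    merged = dict(auto)
    merged.update(user)
    return merged


def missing_items(spec: Dict[str, str]) -> List[str]:
    """List required items that are absent or empty."""
    return [name for name in REQUIRED_ITEMS if not spec.get(name)]


def make_request(
    auto: Dict[str, str], user: Dict[str, str], teacher_data: Iterable[str]
) -> Optional[dict]:
    """Build a learning request only once every item is set (step S5)."""
    spec = merge_items(auto, user)
    if missing_items(spec):
        return None  # the flow would go back to step S1 for more input
    return {"specification": spec, "teacher_data": list(teacher_data)}
```

Returning `None` while items are missing mirrors the flowchart's return to step S1; a request is produced only when the list from `missing_items` is empty.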

FIG. 10 is an explanatory diagram for explaining learning by the learning device 20 in the first method and the inference model obtained as a result. In step S11 of FIG. 9, the control unit 21 of the learning device 20 waits for a learning request. When a learning request occurs, the control unit 21 proceeds to step S12, receives the specification information from the learning request device 30, and acquires the requested specifications. The control unit 21 also acquires the teacher data from the learning request device 30 and records it in the teacher data recording unit 23 (step S13). The control unit 21 additionally records teacher data as required by the specifications and as necessary (step S14). The mother set creation unit 24 sets the input and output through the input data setting unit 24a and the output item setting unit 24b in accordance with the required specifications based on the specification information. The input/output modeling unit 25 performs learning based on the teacher data according to the set input and output.

As shown in FIG. 10, the input/output modeling unit 25 supplies a predetermined network N1 with a large number of images corresponding to inputs and outputs as teacher data. In the example of FIG. 10, images of cats whose faces are seen from the side are input as teacher data. Each such cat image is annotated so that the face portion is enclosed in a dashed frame. Face orientation is described relative to the imaging device 10, and in the following description a face seen from the side may also be called a sideways face. That is, a sideways face means that the cat is not facing in the direction of the optical axis of the imaging device 10. A front-facing face, described later, means that the face is directed along the optical axis of the imaging device 10.

Through learning with a large amount of teacher data, the network design of the network N1 is determined so that an output corresponding to the input is obtained. That is, in the example of FIG. 10, when an image of a cat is input and the cat's face is turned sideways, information for displaying a dashed frame at a position surrounding the face portion of the image is obtained together with reliability information. In other words, in the example of FIG. 10, an inference model that detects images of cats with sideways faces is constructed.

Note that "deep learning" is a multilayered form of the "machine learning" process that uses a neural network. A typical example is the "feedforward neural network," which passes information from front to back to make a determination. In its simplest form, it needs only three layers: an input layer of N1 neurons, an intermediate layer of N2 neurons given by a parameter, and an output layer of N3 neurons corresponding to the number of classes to be discriminated. The neurons of the input layer and the intermediate layer, and those of the intermediate layer and the output layer, are connected by connection weights, and a bias value is added in the intermediate and output layers, which makes it easy to form logic gates. Three layers suffice for simple discrimination, but with many intermediate layers it also becomes possible to learn how to combine multiple feature quantities in the course of machine learning. In recent years, networks with 9 to 152 layers have become practical in terms of training time, determination accuracy, and energy consumption.
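The remark that weights and biases make it easy to form logic gates can be made concrete with a tiny three-layer network. The sketch below is an illustration, not part of the publication: the weights and biases are hand-chosen (not learned), a step activation is assumed, and the layer sizes are two input neurons, two intermediate neurons, and one output neuron.

```python
from typing import List


def step(x: float) -> int:
    """Step activation: fire when the weighted sum is positive."""
    return 1 if x > 0 else 0


def layer(inputs: List[float], weights: List[List[float]],
          biases: List[float]) -> List[int]:
    """One fully connected layer: weighted sum plus bias, then step."""
    return [
        step(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-chosen parameters (assumptions for this sketch):
# intermediate neuron 0 behaves as OR, intermediate neuron 1 as AND.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [-0.5, -1.5]
# The output fires when OR is on and AND is off, which is XOR.
OUTPUT_W = [[1.0, -1.0]]
OUTPUT_B = [-0.5]


def xor_net(a: int, b: int) -> int:
    """Evaluate the three-layer network on two binary inputs."""
    hidden = layer([a, b], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]
```

XOR cannot be computed by a single weighted sum with a threshold, so this small case also shows why an intermediate layer is needed even for simple discrimination.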

Various known networks may be used as the network N1 for machine learning. For example, R-CNN (Regions with CNN features), which uses a CNN (Convolutional Neural Network), or FCN (Fully Convolutional Networks) may be used. These involve a process called "convolution" that compresses the feature quantities of an image, run with minimal processing, and are strong at pattern recognition. A "recurrent neural network" (fully connected recurrent neural network), in which information flows in both directions, may also be used to handle more complex information and to support analysis of information whose meaning changes with order or sequence.
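For readers unfamiliar with the convolution step, the following sketch shows the core sliding-window operation in plain Python. It is an illustration under simplifying assumptions (single channel, no padding, stride 1, no kernel flip, which is the cross-correlation form commonly used in CNNs); the function name and the example kernel are not from the publication.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A dark-to-bright vertical edge and a kernel that responds to it.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [[-1, 1]]
feature_map = conv2d_valid(image, edge_kernel)
```

The resulting feature map is nonzero only in the column where the brightness changes, which is the sense in which convolution condenses an image into feature quantities.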

To realize these techniques, conventional general-purpose arithmetic processing circuits such as a CPU or an FPGA (Field Programmable Gate Array) may be used. However, because much of the processing of a neural network is matrix multiplication, a GPU (Graphics Processing Unit) or a Tensor Processing Unit (TPU), which are specialized for matrix calculation, may also be used. In recent years, such dedicated artificial intelligence (AI) hardware, a "neural network processing unit (NPU)," has been designed so that it can be integrated together with other circuits such as a CPU, and in some cases it forms part of the processing circuit.

The inference model may also be obtained by employing not only deep learning but also various known machine learning techniques. For example, there are techniques such as the support vector machine and support vector regression. Learning in this case calculates the weights, filter coefficients, and offsets of a classifier; another technique uses logistic regression processing. To have a machine make a determination, a human needs to teach the machine how to make it. This embodiment adopts a technique in which the image determination is derived by machine learning, but a rule-based technique, which applies rules that humans have acquired through empirical rules and heuristics, may also be used for a specific determination.

In step S16, the input/output modeling unit 25 determines whether the generated inference model satisfies the required specifications. If it does not, the input/output modeling unit 25 determines in the next step S17 whether the model satisfies the conditions specified by the priorities in the specification information. For example, when the item priorities are as shown in FIG. 3, the input/output modeling unit 25 determines whether the generated inference model satisfies the specifications of each of the detection target item, hardware information item, accuracy rate, reliability item, response time item, input image size, other items, and number of input image systems item. If the conditions are satisfied, the input/output modeling unit 25 resets the teacher data and the like (step S18), determines whether the predetermined number of repetitions has been reached (step S19), and, if the number of repetitions is less than the predetermined number, returns the process to step S15 to repeat the inference modeling.

If the input/output modeling unit 25 determines in step S17 that the generated inference model does not satisfy the conditions specified by the priorities, or determines in step S19 that the inference modeling has been repeated the predetermined number of times or more, the process proceeds to step S20, and weak image information, indicating that the images are ones for which it is difficult to construct an effective inference model, is transmitted to the learning request device 30.

If the input/output modeling unit 25 determines in step S16 that the generated inference model satisfies the required specifications, the process proceeds to step S21, and the inference model information is transmitted to the learning request device 30 and, as necessary, to a designated device (for example, the imaging device 10).
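
The flow of steps S15 to S21 can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the input/output modeling unit 25: the function name `build_inference_model`, its callback parameters, and the returned tags are hypothetical, and the repetition count is assumed to be the number of completed inference modeling rounds.

```python
from typing import Callable, Tuple


def build_inference_model(
    train: Callable[[], object],
    meets_required_spec: Callable[[object], bool],
    meets_priority_conditions: Callable[[object], bool],
    reset_teacher_data: Callable[[], None],
    max_rounds: int,
) -> Tuple[str, object]:
    """Return ("model", model) on success or ("weak_image", None) otherwise."""
    rounds = 0
    while True:
        model = train()                           # step S15: inference modeling
        rounds += 1
        if meets_required_spec(model):            # step S16
            return ("model", model)               # step S21: send model information
        if not meets_priority_conditions(model):  # step S17
            return ("weak_image", None)           # step S20: send weak image information
        reset_teacher_data()                      # step S18
        if rounds >= max_rounds:                  # step S19
            return ("weak_image", None)           # step S20
```

The two failure paths return the same tag because both lead to step S20 in the flow described above.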

In this embodiment, at the time of this transmission, the input/output modeling unit 25 adds inference setting information, generated based on the specification information, to the inference model information. In the example of FIG. 10, the inference setting information includes information indicating that detection is to be performed by the first method, that is, that inference by the inference engine and inference by the combined inference device are executed independently of each other and an inference result is obtained based on the reliability and the like of each inference.
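
One possible way to represent inference model information with inference setting information attached is sketched below. The description does not define a data format, so the class names, field names, and the string descriptions are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class CombinationMethod(Enum):
    """How the inference engine and the combined inference device cooperate."""
    FIRST = 1   # both infer independently; the result is chosen by reliability
    SECOND = 2  # the inference engine passes its result to the combined device
    THIRD = 3   # the combined device passes its result to the inference engine


@dataclass
class InferenceSetting:
    method: CombinationMethod
    combined_device: str = "rule_base_engine"  # hypothetical identifier


@dataclass
class InferenceModelInfo:
    network_parameters: dict   # e.g. weights of the network (format assumed)
    setting: InferenceSetting  # inference setting information added on transmission


def describe(setting: InferenceSetting) -> str:
    """Return a short human-readable description of the configured method."""
    descriptions = {
        CombinationMethod.FIRST: "independent inference, selected by reliability",
        CombinationMethod.SECOND: "inference engine result is given to the combined device",
        CombinationMethod.THIRD: "combined device result is given to the inference engine",
    }
    return descriptions[setting.method]
```

A device receiving such a record could read `setting.method` to decide how to drive its two engines.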

Upon receiving the inference model information, the control unit 31 of the learning request device 30 determines, in step S7 of FIG. 8, whether the learning result is acceptable by using the test data 34b. If the control unit 31 determines that the test result is good (OK), it requests the learning device 20 to transmit the tested inference model information to the imaging device 10; if it determines that the test result is bad (NG), it requests the learning device 20 to perform re-learning. In the latter case, the required specifications may be added to or modified. Note that the learning request device 30 may instead transmit the inference model information judged OK directly to the imaging device 10.

FIG. 11 is a flowchart for explaining the operation of the imaging device 10.

The control unit 11 of the imaging device 10 determines in step S31 of FIG. 11 whether the shooting mode is specified. If the shooting mode is not specified, the control unit 11 determines in step S32 whether acquisition of an inference model is instructed. If acquisition is not instructed, the process returns to step S31; if it is instructed, the specifications of the inference model are confirmed in the next step S33. That is, the control unit 11 receives the inference model information from the learning device 20 or the learning request device 30, and constructs the network 17a of the inference engine 17 based on the received inference model information. An inference model similar to N1 in FIG. 10 is thus constructed in the network 17a. The inference setting unit 11d reads the inference setting information added to the inference model information and, taking into account the functions of the built-in rule base engine 18, confirms whether the necessary specifications are satisfied.

In step S34, the control unit 11 determines whether the result of the specification confirmation is good (OK). If it is not, the control unit 11 transmits, in step S38, information requesting resetting of the required specifications to the learning device 20 or the learning request device 30. If the result of the specification confirmation is good, the control unit 11 reads the test data from the test data recording area 16c and executes an inference test. The control unit 11 determines from the test result whether sufficient reliability has been obtained; if so, it finalizes the acquired inference model information (step S37), and if not, it requests resetting of the required specifications in step S38.
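
The acceptance procedure of steps S33 to S38 on the imaging device side can be sketched as follows. The function name, the callbacks, and the use of a numeric reliability threshold are assumptions; the description states only that "sufficient reliability" is judged from the test result.

```python
def acquire_inference_model(model_info, spec_ok, run_test, reliability_threshold):
    """Return ("confirmed", model_info) or ("request_respecification", None).

    spec_ok(model_info)  -> bool   specification confirmation (steps S33, S34)
    run_test(model_info) -> float  reliability measured on stored test data
    """
    if not spec_ok(model_info):
        return ("request_respecification", None)   # step S38
    reliability = run_test(model_info)              # inference test with test data
    if reliability >= reliability_threshold:
        return ("confirmed", model_info)            # step S37
    return ("request_respecification", None)        # step S38
```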

制御部１１は、撮影モードが指定されている場合には、ステップＳ３１からステップＳ４１に処理を移行して撮像画像を取り込む。撮像部１２からの撮像画像は制御部１１に与えられると共に、推論エンジン１７及びルールベースエンジン１８にも与えられる。制御部１１の推論設定部１１ｄは、推論設定情報に基づいて、第１手法により、推論エンジン１７及びルールベースエンジン１８の双方に独立して推論を行うように指示する。 When the photographing mode is designated, the control unit 11 shifts the process from step S31 to step S41 and captures the captured image. A captured image from the imaging unit 12 is provided to the control unit 11 and also to the inference engine 17 and the rule base engine 18 . Based on the inference setting information, the inference setting unit 11d of the control unit 11 instructs both the inference engine 17 and the rule base engine 18 to perform inference independently by the first method.

図７に示すようにユーザ４１は撮影を行う。制御部１１の表示制御部１１ｆは、撮像部１２からの撮像画像を表示部１４に与えてライブビュー画像を表示させる。また、推論エンジン１７及びルールベースエンジン１８は、順次入力される撮像画像に対して推論を実施する（ステップＳ４２）。 As shown in FIG. 7, the user 41 takes a picture. The display control unit 11f of the control unit 11 gives the captured image from the imaging unit 12 to the display unit 14 to display a live view image. Also, the inference engine 17 and the rule base engine 18 perform inference on the captured images that are sequentially input (step S42).

FIG. 12 is an explanatory diagram for explaining inference by the rule base engine 18, and FIG. 13 is an explanatory diagram for explaining inference by the inference engine 17.

いま、ある瞬間に図１２に示す撮像画像ＰＩ１がルールベースエンジン１８に入力されるものとする。ルールベースエンジン１８は、公知の顔検出処理によって、猫の正面向きの顔の部分を検出する。ルールベースエンジン１８は、検出結果として、猫の顔の部分を囲う実線枠ＰＯ１ａの情報を出力する。この情報は、ルールベースエンジン１８から制御部１１に供給される。表示制御部１１ｆは、ルールベースエンジン１８の検出結果に基づいて、ライブビュー画像上に実線枠ＰＯ１ａを重畳した表示画像ＰＯ１を表示する。 Assume that the captured image PI1 shown in FIG. 12 is input to the rule base engine 18 at a certain moment. The rule-based engine 18 detects the part of the cat's front facing face by a known face detection process. The rule-based engine 18 outputs the information of the solid-line frame PO1a surrounding the cat's face as the detection result. This information is supplied from the rule base engine 18 to the control unit 11 . Based on the detection result of the rule base engine 18, the display control unit 11f displays the display image PO1 in which the solid line frame PO1a is superimposed on the live view image.

Also assume that the captured image PI2 shown in FIG. 13 is input to the inference engine 17 at a certain moment. The inference engine 17 detects the sideways-facing part of the cat's face by means of the network 17a, which is based on the inference model information provided by the learning device 20 or the learning request device 30. As the detection result, the inference engine 17 outputs information on a broken-line frame PO2a surrounding the sideways-facing face of the cat. This information is supplied from the inference engine 17 to the control unit 11. Based on the detection result of the inference engine 17, the display control unit 11f displays a display image PO2 in which the broken-line frame PO2a is superimposed on the live view image.

FIG. 14 is an explanatory diagram showing an example of image display based on a series of inference detection results. FIG. 14 shows live view images displayed on the display screen 14a of the display unit 14; specifically, it shows images P1 to P6 among the images that are sequentially captured over time and displayed in live view while the user 41 attempts to photograph the cat 46 on the block wall 45, as in the shooting scene of FIG. 7. Image P1 is obtained by imaging a relatively wide range and includes an image of the cat 46 on the block wall 45. In step S43, the display control unit 11f of the control unit 11 displays, on the display screen 14a, an "activation" display P1a indicating that the inference engine 17 and the rule base engine 18 are operating to detect the image of the cat.

Here, assume that the user 41 performs a zoom operation so that mainly only the cat is captured within the field of view. Image P2 and the subsequent images show live view images in this state. Image P2 is captured with the cat's face facing forward. In this case, the rule base engine 18 detects the cat's face and supplies information on a solid-line frame surrounding the face to the control unit 11. On determining that a detection has been made, the control unit 11 determines whether the detection by the inference engine 17 or the detection by the rule base engine 18 was used (step S45). For example, the control unit 11 may make this determination based on the reliability output from the inference engine 17 and the rule base engine 18. For image P2 the detection result of the rule base engine 18 is adopted, so the process proceeds to step S46 and the display control unit 11f displays a solid-line frame display P2b on the live view image. The display control unit 11f also displays an "R activation" display P2a, indicating that the rule base engine 18 was activated and detected the cat's face, together with the characters "face detection."

The next image P3 is captured with the cat's face facing sideways. In this case, the inference engine 17 detects the sideways-facing face of the cat and supplies information on a broken-line frame surrounding the face to the control unit 11. The display control unit 11f then displays, in step S47 following step S45, a broken-line frame display P3b on the live view image. The display control unit 11f also displays an "A activation" display P3a, indicating that the inference engine 17 was activated and detected the cat's face, together with the characters "AI inference."

The next image P4 is also captured with the cat's face facing sideways. In this case, a broken-line frame display P4b based on the information from the inference engine 17, an "A activation" display P4a indicating that the inference engine 17 was activated and detected the cat's face, and the characters "AI inference" are displayed on the live view image.

The next image P5 is again captured with the cat's face facing forward. In this case, the rule base engine 18 detects the cat's face, and a solid-line frame display P5b based on the information from the rule base engine 18, an "R activation" display P5a indicating that the rule base engine 18 was activated and detected the cat's face, and the characters "face detection" are displayed on the live view image.

In the last image P6, the "activation" display P6a indicates that cat detection by the inference of the inference engine 17 and the rule base engine 18 is active. However, the image shows a state in which the cat's face has not been detected.

制御部１１は、ステップＳ４８において、動画撮影操作又は静止画撮影操作が行われたか否かを判定する。これらの操作が行われた場合には、撮影や記録を行う（ステップＳ４９）。なお、動画撮影時には、動画撮影の終了操作によって、撮像画像がファイル化される。撮影操作が行われていない場合には、制御部１１は処理をステップＳ３１に戻す。 In step S48, the control unit 11 determines whether or not a moving image shooting operation or a still image shooting operation has been performed. When these operations are performed, shooting and recording are performed (step S49). It should be noted that, at the time of moving image shooting, the captured image is made into a file by the end operation of moving image shooting. If the photographing operation has not been performed, the control unit 11 returns the processing to step S31.

As described above, in the first method, the inference engine 17 and the rule base engine 18, which is the combined inference device, operate independently of each other; this method is adopted, for example, when the two share the inference work between them.
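
The selection performed in the first method can be sketched as follows. This is a minimal sketch, assuming that each engine returns a frame with a numeric reliability and that the control unit simply adopts the more reliable result above a minimum value; the `Detection` type, the default threshold of 0.5, and the label strings are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Detection(NamedTuple):
    box: Tuple[int, int, int, int]  # x, y, width, height of the frame
    reliability: float              # assumed to lie between 0.0 and 1.0
    source: str                     # "R" (rule base engine) or "A" (inference engine)


def first_method(rule_result: Optional[Detection],
                 ai_result: Optional[Detection],
                 min_reliability: float = 0.5) -> Optional[Detection]:
    """Adopt the more reliable of two independently obtained results (step S45)."""
    candidates = [d for d in (rule_result, ai_result)
                  if d is not None and d.reliability >= min_reliability]
    if not candidates:
        return None  # nothing detected, as in image P6
    return max(candidates, key=lambda d: d.reliability)


def overlay_label(detection: Optional[Detection]) -> str:
    """Text shown on the live view image for the adopted result."""
    if detection is None:
        return "activation"
    if detection.source == "R":
        return "R activation / face detection"   # solid-line frame, step S46
    return "A activation / AI inference"         # broken-line frame, step S47
```

Because neither engine depends on the other's output, the two inferences could also run in parallel before the selection is made.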

(Second method)
The operations of the learning device 20 and the learning request device 30 are the same for the first to third methods, but the specification information and the teacher data differ for each of the first to third methods.

図１５は、第２手法における学習装置２０による学習及び学習の結果得られる推論モデルを説明するための説明図である。 FIG. 15 is an explanatory diagram for explaining learning by the learning device 20 in the second method and an inference model obtained as a result of the learning.

In FIG. 15, a large number of images corresponding to inputs and outputs are given to a predetermined network N2 as teacher data. In the example of FIG. 15, the images are of cats: images of a cat facing forward and images of a cat facing sideways are input. For an image of a cat facing forward, position information of the face part (solid-line frame) is set as an annotation; for an image of a cat facing sideways, an annotation is set so that the face part is surrounded by a broken-line frame. Furthermore, in the second method, an annotation for prompting the use of the combined inference device is set for images of a cat facing forward.

As a result, the network design of the network N2 is determined so that an output corresponding to the input is obtained. That is, in the example of FIG. 15, when an image of a cat is input and the cat's face is facing forward, an output prompting the use of the combined inference device is obtained, together with position information of the face image part (solid-line frame) and reliability information; when the cat's face is facing sideways, information for displaying a broken-line frame at a position surrounding the face image part is obtained together with reliability information.
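
The annotations used as teacher data in the second method might be represented as follows. The record layout, the field names, and the orientation labels "front" and "side" are assumptions for illustration; the description specifies only that a frame position and, for forward-facing faces, an annotation prompting use of the combined inference device are set.

```python
def make_annotation(face_box, orientation):
    """Build one annotation record for a cat image (second method).

    face_box    -- (x, y, width, height) of the face part
    orientation -- "front" or "side" (hypothetical labels)
    """
    if orientation not in ("front", "side"):
        raise ValueError("orientation must be 'front' or 'side'")
    is_front = orientation == "front"
    return {
        "face_box": face_box,                             # position information of the face
        "frame_style": "solid" if is_front else "dashed",  # frame drawn on the live view
        "use_combined_device": is_front,                  # prompt the combined inference device
    }
```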

FIG. 16 is a flowchart for explaining the operation of the imaging device 10. In FIG. 16, the same steps as in FIG. 11 are denoted by the same reference numerals, and their description is omitted. In the second method as well, the rule base engine 18 is assumed to detect the face of a cat facing forward, as shown in FIG. 12.

制御部１１は、撮影モードが指定されている場合には、ステップＳ３１からステップＳ４１に処理を移行して撮像画像を取り込む。撮像部１２からの撮像画像は制御部１１に与えられると共に、推論エンジン１７及びルールベースエンジン１８にも与えられる。制御部１１の推論設定部１１ｄは、推論モデル情報に付加された推論設定情報に基づいて、推論エンジン１７及びルールベースエンジン１８に対して、第２手法により推論を行うように制御する。 When the photographing mode is designated, the control unit 11 shifts the process from step S31 to step S41 and captures the captured image. A captured image from the imaging unit 12 is provided to the control unit 11 and also to the inference engine 17 and the rule base engine 18 . Based on the inference setting information added to the inference model information, the inference setting unit 11d of the control unit 11 controls the inference engine 17 and the rule base engine 18 to perform inference by the second method.

制御部１１は、次のステップＳ５２において、推論エンジン１７による推論結果が併用推論機器の利用を促すものであるか否かを判定する。 In the next step S52, the control unit 11 determines whether or not the inference result by the inference engine 17 prompts the use of the combined inference device.

いま、ある瞬間に図１３に示す撮像画像ＰＩ２が推論エンジン１７に入力されるものとする。推論エンジン１７は、学習装置２０又は学習依頼装置３０から提供を受けた推論モデル情報に基づくネットワーク１７ａにより、猫の横向きの顔の部分を検出する。推論エンジン１７は、検出結果として、猫の横向きの顔の部分を囲う破線枠ＰＯ２ａ（図１３参照）の情報を出力する。この情報は、推論エンジン１７から制御部１１に供給される。この場合には、制御部１１は、ステップＳ５２からステップＳ５３に移行して、推論の信頼性が高いか否かを判定する。制御部１１は、推論の信頼性が高い場合には、検出結果を表示する。即ち、表示制御部１１ｆは、推論エンジン１７の検出結果に基づいて、ライブビュー画像上に破線枠ＰＯ２ａを重畳した表示画像ＰＯ２を表示する。 Assume that the captured image PI2 shown in FIG. 13 is input to the inference engine 17 at a certain moment. The inference engine 17 uses the network 17a based on the inference model information provided by the learning device 20 or the learning requesting device 30 to detect the part of the cat's sideways face. The inference engine 17 outputs, as the detection result, the information of the broken-line frame PO2a (see FIG. 13) surrounding the sideways face of the cat. This information is supplied from the inference engine 17 to the control unit 11 . In this case, the control unit 11 proceeds from step S52 to step S53 and determines whether or not the reliability of the inference is high. The control unit 11 displays the detection result when the reliability of the inference is high. That is, based on the detection result of the inference engine 17, the display control unit 11f displays the display image PO2 in which the dashed frame PO2a is superimposed on the live view image.

Also assume that the captured image PI1 shown in FIG. 12 is input to the inference engine 17 and the rule base engine 18 at a certain moment. The inference engine 17 detects the forward-facing part of the cat's face by means of the network 17a based on the inference model information. As the detection result, the inference engine 17 outputs to the control unit 11 information on the image position of the cat's forward-facing face and information prompting detection by the rule base engine 18. In this case, the control unit 11 shifts the process from step S52 to step S55, gives the inference result of the inference engine 17, that is, the information on the image position of the cat's face, to the rule base engine 18, and causes the rule base engine 18 to detect the cat's face. The rule base engine 18 detects the forward-facing part of the cat's face by known face detection processing. Because the rule base engine 18 has been given the information on the image position of the cat's face, it can detect the cat's face with higher accuracy.

ルールベースエンジン１８は、検出結果として、猫の顔の部分を囲う実線枠ＰＯ１ａ（図１２参照）の情報を出力する。この情報は、ルールベースエンジン１８から制御部１１に供給される。表示制御部１１ｆは、ルールベースエンジン１８の検出結果に基づいて、ライブビュー画像上に実線枠ＰＯ１ａを重畳した表示画像ＰＯ１を表示する（ステップＳ５６）。 The rule-based engine 18 outputs the information of the solid-line frame PO1a (see FIG. 12) surrounding the cat's face as the detection result. This information is supplied from the rule base engine 18 to the control unit 11 . The display control unit 11f displays the display image PO1 in which the solid line frame PO1a is superimposed on the live view image based on the detection result of the rule base engine 18 (step S56).

As described above, in the second method, the inference engine 17 gives its inference result to the rule base engine 18 and causes the rule base engine 18 to perform inference. For images and the like on which the rule base engine 18 achieves high inference accuracy, information for further improving that accuracy can thus be provided to the rule base engine 18, and as a result detection with high inference accuracy is possible.
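
The second method, in which the inference engine hands its result to the rule base engine, can be sketched as follows. The engines are passed in as plain callables, and the dictionary keys, the `hint` parameter, and the reliability threshold are assumptions made for illustration rather than interfaces defined in the description.

```python
def second_method(image, inference_engine, rule_base_engine, min_reliability=0.5):
    """Return the frame to display, or None when nothing reliable is detected.

    inference_engine(image) -> dict with "box", "reliability",
                               "use_combined_device", or None
    rule_base_engine(image, hint=box) -> refined box, or None
    """
    result = inference_engine(image)
    if result is None:
        return None
    if result["use_combined_device"]:                       # step S52
        # Step S55: give the face position found by the inference engine
        # to the rule base engine as a hint for its own detection.
        box = rule_base_engine(image, hint=result["box"])
        if box is None:
            return None
        return {"box": box, "frame": "solid", "source": "R"}   # step S56
    if result["reliability"] >= min_reliability:            # step S53
        return {"box": result["box"], "frame": "dashed", "source": "A"}
    return None
```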

(Third method)
In the third method as well, the operations of the learning device 20 and the learning request device 30 are the same as in the other methods, and the specification information and the teacher data differ from those of the other methods.

In the third method, the inference model created in the learning device 20 is set so that not only the captured image but also the detection result of the combined inference device is used as input. For example, learning is performed so as to construct an inference model that detects the cat's face when information such as an image of the cat's face part or the image position of the cat's face part is input from the combined inference device and an image of the cat is input from the imaging unit 12.

FIG. 17 is a flowchart for explaining the operation of the imaging device 10. In FIG. 17, the same steps as in FIG. 11 are denoted by the same reference numerals, and their description is omitted. In the third method as well, the rule base engine 18 is assumed to detect the face of a cat facing forward, as shown in FIG. 12.

制御部１１は、撮影モードが指定されている場合には、ステップＳ３１からステップＳ４１に処理を移行して撮像画像を取り込む。撮像部１２からの撮像画像は制御部１１に与えられると共に、推論エンジン１７及びルールベースエンジン１８にも与えられる。制御部１１の推論設定部１１ｄは、推論モデル情報に付加された推論設定情報に基づいて、推論エンジン１７及びルールベースエンジン１８に対して、第３手法により推論を行うように制御する。 When the photographing mode is designated, the control unit 11 shifts the process from step S31 to step S41 and captures the captured image. A captured image from the imaging unit 12 is provided to the control unit 11 and also to the inference engine 17 and the rule base engine 18 . Based on the inference setting information added to the inference model information, the inference setting unit 11d of the control unit 11 controls the inference engine 17 and the rule base engine 18 to perform inference by the third method.

制御部１１は、ステップＳ４１の次のステップＳ６１において、ルールベースエンジン１８に推論を実行させて猫の顔を検出させる。ルールベースエンジン１８からの検出結果は制御部１１に与えられる。制御部１１は、ルールベースエンジン１８の検出結果の判定を行う（ステップＳ６２）。 In step S61 following step S41, the control unit 11 causes the rule-based engine 18 to execute inference and detect the cat's face. A detection result from the rule base engine 18 is given to the control unit 11 . The control unit 11 determines the detection result of the rule base engine 18 (step S62).

次に、制御部１１は、ステップＳ６３において、推論エンジン１７に推論を実行させる。この場合には、制御部１１は、ルールベースエンジン１８の検出結果を推論エンジン１７に与える。これにより、推論エンジン１７は、撮像部１２からの入力画像とルールベースエンジン１８の推論結果とが与えられ、これらの入力を用いた推論によって、推論結果を得る。推論エンジン１７の推論結果は制御部１１に出力される。 Next, the control unit 11 causes the inference engine 17 to perform inference in step S63. In this case, the control unit 11 gives the detection result of the rule base engine 18 to the inference engine 17 . As a result, the inference engine 17 receives the input image from the imaging unit 12 and the inference result of the rule base engine 18, and obtains the inference result by inference using these inputs. The inference result of the inference engine 17 is output to the control unit 11 .

In step S64, the control unit 11 makes a determination on the inference result of the inference engine 17. Furthermore, in step S65, the control unit 11 obtains a detection result by a comprehensive determination on the inference results of the inference engine 17 and the rule base engine 18.

このように第３手法では、ルールベースエンジン１８は、推論結果を推論エンジン１７に与えており、推論エンジン１７の推論の精度をより向上させることができる。 Thus, in the third method, the rule base engine 18 provides the inference result to the inference engine 17, and the inference accuracy of the inference engine 17 can be further improved.
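
The third method, in which the rule base engine runs first and its result is fed to the inference engine, can be sketched as follows. The description does not specify how the comprehensive determination of step S65 is made; this sketch assumes that the most reliable accepted result is adopted. The function names, dictionary keys, and threshold are hypothetical.

```python
def _accepted(result, min_reliability):
    """A result is accepted when it exists and is reliable enough."""
    return result is not None and result["reliability"] >= min_reliability


def third_method(image, rule_base_engine, inference_engine, min_reliability=0.5):
    """Return the adopted detection result, or None.

    rule_base_engine(image)              -> dict with "box", "reliability", or None
    inference_engine(image, rule_result) -> dict with "box", "reliability", or None
    """
    rule_result = rule_base_engine(image)                  # step S61
    rule_ok = _accepted(rule_result, min_reliability)      # step S62
    # Step S63: the inference engine receives both the captured image
    # and the detection result of the rule base engine.
    ai_result = inference_engine(image, rule_result)
    ai_ok = _accepted(ai_result, min_reliability)          # step S64
    # Step S65: comprehensive determination (assumed: highest reliability wins).
    accepted = [r for r, ok in ((rule_result, rule_ok), (ai_result, ai_ok)) if ok]
    if not accepted:
        return None
    return max(accepted, key=lambda r: r["reliability"])
```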

As described above, in this embodiment, more effective inference can be realized by performing learning to construct an inference model on the assumption that a combined inference device will be used. The learning request device creates specification information for learning that assumes the use of such a combined inference device, thereby enabling the learning device to construct an effective inference model. The learning device also adds, to the inference model information, inference setting information indicating, for example, how the inference model is to be used in the device that uses it, and transmits the result. The device that uses the inference model can thus use the combined inference device and the inference engine that uses the inference model in cooperation, based on the inference setting information added to the received inference model information. In this way, the combined inference device, for example a rule base engine, and the inference engine can be operated effectively to obtain more effective inference results.

(Second embodiment)
FIGS. 18 and 19 are block diagrams showing the second embodiment. In this embodiment, the configuration of the learning device differs from that of FIG. 1, and FIGS. 18 and 19 show the configuration added to the learning device 20 of FIG. 1.

The image database (DB) 51a and the correct answer database (DB) 51b in FIGS. 18 and 19 correspond to the teacher data recording unit 23 in the learning device 20 of FIG. 1; the collection of image data in the teacher data is represented as the image DB 51a, and the collection of annotation data in the teacher data is represented as the correct answer DB 51b.

In addition to the configuration of the learning device 20, the learning device of this embodiment includes a combined inference device 53, a detection accuracy measuring device 54, a correct answer information output unit 55, a detection accuracy database (DB) 56, and a detection accuracy information output unit 57. As shown in FIG. 18, the control unit 21 (see FIG. 1) uses the combined inference device 53, giving each piece of image data 52 in the image DB 51a to the combined inference device 53 to execute inference processing. The detection result of the combined inference device 53 is given to the detection accuracy measuring device 54.

The correct answer information output unit 55 acquires, from the correct answer DB 51b, the correct answer information corresponding to the image data input to the combined inference device 53, and outputs it to the detection accuracy measuring device 54. The detection accuracy measuring device 54 measures the detection accuracy of the combined inference device 53 by comparing the detection result of the combined inference device 53 with the correct answer information, and outputs the measurement result to the detection accuracy DB 56. The detection accuracy DB 56 stores, for each image, the detection accuracy of the inference by the combined inference device 53.
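
The per-image measurement described above can be sketched as follows. The description does not state how the detection result is compared with the correct answer information; this sketch assumes that both are boxes and uses intersection over union (IoU) as the accuracy value. The function names and the dictionary-based databases are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def measure_detection_accuracy(image_db, correct_db, combined_device):
    """Build a detection accuracy DB: image id -> accuracy of the combined device.

    image_db        -- dict of image id -> image data (image DB 51a)
    correct_db      -- dict of image id -> correct box (correct answer DB 51b)
    combined_device -- callable(image) -> detected box, or None if nothing found
    """
    accuracy_db = {}
    for image_id, image in image_db.items():
        detected = combined_device(image)
        if detected is None:
            accuracy_db[image_id] = 0.0  # a missed detection counts as zero accuracy
        else:
            accuracy_db[image_id] = iou(detected, correct_db[image_id])
    return accuracy_db
```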

When training the inference model N, the input/output modeling unit 25 supplies each item of image data 52 to the inference model N, as shown in FIG. 19, and also supplies the correct answer information from the correct answer information output unit 55 to the inference model N. The detection accuracy information output unit 57 reads, from the detection accuracy DB 56, the detection accuracy information corresponding to the image data 52 input to the inference model N. Under the control of the control unit 21, the input/output modeling unit 25 uses the detection accuracy information from the detection accuracy information output unit 57 when constructing the inference model N.

For example, when constructing the inference model N, the control unit 21 may use the detection accuracy information corresponding to each image input to the inference model N as teacher data, and perform learning so that inference by the combined inference device is encouraged for images with high detection accuracy. Alternatively, when the detection accuracy of an image is higher than a predetermined threshold, it may be determined that no further learning is needed for that image, and learning may be performed only on images whose detection accuracy is lower than the predetermined threshold. In this case, sufficient learning becomes possible for images whose reliability is relatively low, and the detection accuracy may be improved overall.
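
The threshold-based selection described above might look like the following minimal sketch. The function name `select_for_training` and the dictionary layout are hypothetical. The text only distinguishes accuracies higher and lower than the threshold, so treating an accuracy exactly equal to the threshold as "still needs learning" is an assumption made here.

```python
from typing import Dict, List, Tuple


def select_for_training(
    accuracy_db: Dict[str, float], threshold: float
) -> Tuple[List[str], List[str]]:
    """Split image ids into (images to train inference model N on,
    images left to the combined inference device)."""
    train_ids: List[str] = []
    delegated_ids: List[str] = []
    for image_id, accuracy in accuracy_db.items():
        if accuracy > threshold:
            # The combined inference device already detects this image well,
            # so no further learning is scheduled for it.
            delegated_ids.append(image_id)
        else:
            # Low accuracy (or, by assumption, equal to the threshold):
            # keep the image in the training set for inference model N.
            train_ids.append(image_id)
    return train_ids, delegated_ids


# Example: only "b" and "c" remain in the training set for threshold 0.7.
train_ids, delegated_ids = select_for_training({"a": 0.9, "b": 0.4, "c": 0.7}, 0.7)
```

Concentrating training on the low-accuracy images is what allows inference model N to complement, rather than duplicate, the combined inference device.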

Thus, in the present embodiment, information on the detection accuracy of the combined inference device can be used during learning, making it possible to construct an inference model that enables more effective inference.

In the above embodiments, a digital camera was used as the imaging device in the description. However, the camera may be a digital single-lens reflex camera or a compact digital camera, may be a camera for moving images such as a video camera or a movie camera, and may of course be a camera built into a portable information terminal (PDA: Personal Digital Assistant) such as a mobile phone or a smartphone.

The present invention is not limited to the above embodiments as they are; at the implementation stage, the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the plurality of constituent elements disclosed in the above embodiments. For example, some of the constituent elements shown in an embodiment may be deleted. Furthermore, constituent elements from different embodiments may be combined as appropriate.

Regarding the operation flows in the claims, the specification, and the drawings, even where they are described using terms such as "first" and "next" for convenience, this does not mean that they must be carried out in that order. It also goes without saying that the steps constituting these operation flows may be omitted as appropriate where they do not affect the essence of the invention.

Of the techniques described here, the control described mainly with flowcharts can in many cases be set by a program, and the program may be stored in a recording medium or a recording unit. The recording onto the recording medium or recording unit may be done at product shipment, may use a distributed recording medium, or may be done by downloading via the Internet.

In the embodiments, a portion described as a "unit" (section or unit) may be configured as a dedicated circuit or as a combination of a plurality of general-purpose circuits, and may, as necessary, be configured by combining a microcomputer or a processor such as a CPU that operates according to pre-programmed software, or a sequencer such as an FPGA. A design in which an external device takes over part or all of the control is also possible, in which case a wired or wireless communication circuit is interposed. Communication may be performed via Bluetooth, WiFi, a telephone line, or the like, or via USB or the like. A dedicated circuit, a general-purpose circuit, and a control unit may be integrated and configured as an ASIC.

DESCRIPTION OF SYMBOLS: 1… imaging device, 10… control unit, 11… image processing unit, 11a… image quality improvement unit, 12… display control unit, 13… recording control unit, 14… action detection unit, 20… imaging unit, 21… optical system, 22… image sensor, 23… displacement applying unit, 30… specific range estimation unit, 31… scene determination unit, 32… good composition storage unit, 33… improvement prediction unit, 41… display unit, 42… operation unit, 43… sensor unit, 44… recording unit.

Claims

A learning request device comprising:
a specification setting unit that generates specification information describing settings of a plurality of specification items for requesting specifications of an inference model; and
a control unit that performs control for transmitting the specification information together with teacher data to a learning device that generates the inference model,
wherein the specification setting unit includes, in the specification information, a specification item for setting whether or not an inference engine using the inference model performs inference in cooperation with another combined inference device.

The learning request device according to claim 1, wherein the specification setting unit includes, in the specification information, specification items for setting information on functions of the other combined inference device and information on input/output between the inference engine and the other combined inference device.

The learning request device according to claim 1, wherein the specification setting unit includes, in the specification information, specification items for setting priorities of the plurality of specification items.

A learning device comprising:
an inference modeling unit that builds an inference model based on specification information describing settings of a plurality of specification items for determining specifications of the inference model, the specification information being for an inference engine that uses the inference model to perform inference in cooperation with another combined inference device; and
a control unit that performs control for adding inference setting information, regarding a method of cooperation with the combined inference device, to inference model information for constructing the inference model based on the specification information, and for transmitting the inference model information.

The learning device according to claim 4, wherein the control unit adds the specification information to the inference model information as the inference setting information.

The learning device according to claim 4, wherein the control unit adds the inference setting information to the inference model information as header information of the inference model information.

An inference model utilization device comprising:
a communication unit that receives inference model information for constructing an inference model premised on an inference engine and a combined inference device cooperating to perform inference, the inference model information having added to it inference setting information regarding a method of cooperation between the inference engine and the combined inference device;
the inference engine, which performs inference using an inference model configured based on the inference model information;
the combined inference device; and
a control unit that causes the inference engine and the combined inference device to cooperate and execute inference based on the inference setting information.

The inference model utilization device according to claim 7, wherein the control unit operates the inference engine and the combined inference device independently of each other based on the inference setting information, and obtains a first inference result of the inference engine or a second inference result of the combined inference device based on a first reliability of the first inference result and a second reliability of the second inference result.

The inference model utilization device according to claim 7, wherein the control unit, based on the inference setting information, supplies a first inference result of the inference engine to the combined inference device to obtain a second inference result of the combined inference device.

The inference model utilization device according to claim […], wherein the control unit, based on the inference setting information, supplies a second inference result of the combined inference device to the inference engine to obtain a first inference result of the inference engine.

An inference model utilization method comprising:
receiving inference model information for constructing an inference model premised on an inference engine and a combined inference device cooperating to perform inference, the inference model information having added to it inference setting information regarding a method of cooperation between the inference engine and the combined inference device;
constructing the inference engine using an inference model configured based on the inference model information; and
causing the inference engine and the combined inference device to cooperate and execute inference based on the inference setting information.

An inference model utilization program for causing a computer to execute procedures of:
receiving inference model information for constructing an inference model premised on an inference engine and a combined inference device cooperating to perform inference, the inference model information having added to it inference setting information regarding a method of cooperation between the inference engine and the combined inference device;
constructing the inference engine using an inference model configured based on the inference model information; and
causing the inference engine and the combined inference device to cooperate and execute inference based on the inference setting information.

An imaging device comprising:
an imaging unit that acquires a captured image of a subject;
a communication unit that receives inference model information for constructing an inference model premised on an inference engine and a combined inference device cooperating to perform inference, the inference model information having added to it inference setting information regarding a method of cooperation between the inference engine and the combined inference device;
the inference engine, which performs inference using an inference model configured based on the inference model information;
the combined inference device; and
a control unit that detects a predetermined object from the captured image by causing the inference engine and the combined inference device to perform inference based on the inference setting information.