JP2022070747A

JP2022070747A - Information processing apparatus and information processing method

Info

Publication number: JP2022070747A
Application number: JP2020179983A
Authority: JP
Inventors: 将史瀧本; Masafumi Takimoto; 竜也山本; Tatsuya Yamamoto; 英太小野; Eita Ono; 悟間宮; Satoru Mamiya; 茂樹弘岡; Shigeki Hirooka
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2020-10-27
Filing date: 2020-10-27
Publication date: 2022-05-13

Abstract

To provide a technique for enabling processing by a learning model according to a situation even if processing is difficult based on only information collected in the past, or even if there is no information collected in the past. SOLUTION: One or more learning models are selected, as candidate learning models, from a plurality of learning models learned under learning environments different from each other based on information concerning image capturing of an object. One or more candidate learning models are selected from the candidate learning models based on a result of object detection processing by the selected candidate learning models. The object detection processing is performed using a candidate learning model of at least one of the selected candidate learning models. SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for prediction based on captured images.

In recent years, agriculture has seen active efforts to apply IT to a wide range of problems, such as yield prediction, prediction of the optimal harvest time, control of pesticide application amounts, and field restoration planning.

For example, Patent Document 1 discloses a method in which sensor information acquired from the place where crops are grown, together with a database storing that information, is referred to as appropriate so that the growth status and the harvest forecast can be grasped early and abnormal growth conditions can be found and dealt with early.

Patent Document 2 discloses a field management method that suppresses variation in crop quality and yield by performing arbitrary inference with reference to registered information, based on information acquired from a wide variety of crop-related sensors.

Patent Document 1: Japanese Unexamined Patent Publication No. 2005-137209
Patent Document 2: Japanese Unexamined Patent Publication No. 2016-49102

However, the methods proposed so far presuppose that a sufficient number of past cases have been retained for the field in which prediction or the like is to be performed, and that adjustment has already been completed so that the prediction targets can be estimated accurately from information on those cases.

On the other hand, how well a crop turns out is generally strongly affected by environmental variation such as weather and climate, and also differs greatly depending on how workers apply fertilizers, pesticides, and the like. If the conditions set by all external factors were the same every year, there would be no need even to predict yield or harvest time. Unlike manufacturing, however, agriculture involves many external factors that workers cannot control themselves, so prediction is very difficult. Moreover, when predicting yield or the like after a run of weather never experienced before, an estimation system tuned on the past cases described above has difficulty producing a correct prediction.

The hardest case to predict is when such a prediction system is newly introduced to a field. Consider, for example, predicting the yield of a specific field, or detecting non-production regions in order to repair poorly growing regions (dead branches and lesions). For such tasks, images and parameters concerning crops collected in that field in the past are normally kept in a database. When prediction or the like is actually performed for the field, the images captured in the field as currently observed and growth-related data acquired from other sensors are cross-referenced and adjusted so that the prediction is accurate. As noted above, however, when these prediction systems or non-production region detectors are introduced to a different, new field, the conditions (of the field) often do not match, so they cannot be applied immediately. In such cases it has been necessary to collect a sufficient amount of data in the new field and perform the adjustment.

Furthermore, when the prediction system or non-production region detector is adjusted manually, the parameters involved in crop growth are high-dimensional, so the adjustment takes a great deal of effort. Even when deep learning or a comparable machine learning method is used, manual labeling (annotation) is normally needed to obtain good performance on new inputs, so the work cost is high.

Ideally, good prediction and estimation should be possible with simple settings that place little burden on the user, even when a prediction system is newly introduced or when a natural disaster or weather without precedent occurs.

The present invention provides a technique that enables processing by a learning model suited to the situation even when processing is difficult using only information collected in the past, or when no information has been collected in the past.

One aspect of the present invention is characterized by comprising: first selection means for selecting, based on information relating to image capturing of an object, one or more learning models as candidate learning models from a plurality of learning models trained in mutually different learning environments; second selection means for selecting one or more candidate learning models from the candidate learning models selected by the first selection means, based on results of object detection processing by those candidate learning models; and detection means for performing object detection processing on a captured image of the object using at least one of the candidate learning models selected by the second selection means.
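The two selection stages can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `LearningModel` class, the attribute-matching rule in `first_selection`, and the mean-detection-score criterion in `second_selection` are not taken from the publication, which at this point only states that the first stage uses information relating to image capturing and the second stage uses object detection results.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class LearningModel:
    name: str
    # Hypothetical description of the learning environment (e.g. crop, season).
    environment: Dict[str, str]
    # Hypothetical detector: returns (label, score) pairs for one image.
    detect: Callable[[object], List[Tuple[str, float]]]


def first_selection(models, shooting_info, top_k):
    """Stage 1: keep the models whose learning environment best matches
    the shooting information (assumed rule: count of equal attributes)."""
    def matches(model):
        return sum(1 for key, value in shooting_info.items()
                   if model.environment.get(key) == value)
    return sorted(models, key=matches, reverse=True)[:top_k]


def second_selection(candidates, sample_images, top_k):
    """Stage 2: run each candidate on sample images and keep the best ones
    (assumed criterion: mean detection score over all detections)."""
    def mean_score(model):
        scores = [score for image in sample_images
                  for _, score in model.detect(image)]
        return sum(scores) / len(scores) if scores else 0.0
    return sorted(candidates, key=mean_score, reverse=True)[:top_k]
```

The detection means would then run one of the models returned by `second_selection` on the captured images. Passing the detector in as a callable keeps the sketch independent of any particular detection library.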

According to the configuration of the present invention, processing by a learning model suited to the situation is possible even when processing is difficult using only information collected in the past, or when no information has been collected in the past.

A diagram showing a configuration example of the system.
A flowchart of processing performed by the system.
A flowchart showing details of the processing in step S23.
A flowchart showing details of the processing in step S233.
A diagram showing an example of how the camera 10 photographs a field.
A diagram showing difficult cases.
A diagram showing the results of annotation work on captured images.
A diagram showing a display example of a GUI.
A diagram showing a display example of a GUI.
A flowchart of processing performed by the system.
A flowchart showing details of the processing in step S83.
A flowchart showing details of the processing in step S833.
A diagram showing a detection example of detection regions.
A diagram showing a display example of a GUI.
(A) is a diagram showing a configuration example of query parameters, (B) is a diagram showing a configuration example of a parameter set of a learning model, and (C) is a diagram showing a configuration example of query parameters.

Embodiments are described in detail below with reference to the accompanying drawings. The following embodiments do not limit the invention according to the claims. Although a plurality of features are described in the embodiments, not all of them are essential to the invention, and the features may be combined arbitrarily. In the accompanying drawings, identical or similar configurations are given the same reference numbers, and duplicate descriptions are omitted.

[First Embodiment]
This embodiment describes a system that performs analysis processing of a field, such as predicting the yield of crops in the field and detecting locations to be repaired, from captured images of the field taken by a camera.

First, a configuration example of the system according to this embodiment is described with reference to FIG. 1. As shown in FIG. 1, the system includes a camera 10, a cloud server 12, and an information processing apparatus 13.

The camera 10 is described first. The camera 10 captures a moving image of the field and outputs the image of each frame of the moving image as a "captured image of the field". Alternatively, the camera 10 captures still images of the field regularly or irregularly and outputs each still image as a "captured image of the field". For the predictions described later to be made accurately from the captured images, images of the same field should be captured under the same environment and conditions as far as possible. The captured images output from the camera 10 are transmitted to the cloud server 12 and the information processing apparatus 13 via a communication network 11 such as a LAN or the Internet.

The method by which the camera 10 photographs the field is not limited to any particular method. One example is described with reference to FIG. 3(A), in which a camera 33 and a camera 34 serve as the camera 10. In a typical field, crop trees planted systematically by the farmer form rows; as shown in FIG. 3(A), many rows of crop trees, such as a row 30 and a row 31, are planted side by side. An agricultural tractor 32 is fitted with the camera 34, which photographs the row 31 of crop trees on the left side with respect to the traveling direction indicated by the arrow, and the camera 33, which photographs the row 30 on the right side. Accordingly, as the agricultural tractor 32 moves between the row 30 and the row 31 in the traveling direction indicated by the arrow, the camera 34 captures a plurality of images of the crop trees in the row 31 and the camera 33 captures a plurality of images of the crop trees in the row 30.

In the many fields that are designed for the agricultural tractor 32 to enter and work in, and where crop trees are planted at equal intervals, photographing the crop trees with the cameras 33 and 34 mounted on the agricultural tractor 32 as shown in FIG. 3(A) makes it relatively easy to photograph a larger number of crop trees at a constant height and at a constant distance from the trees. Images of the entire target field can therefore be captured under nearly identical conditions, so image capturing under desirable conditions is easily achieved.

Other photographing methods may be adopted as long as the field can be photographed under roughly equivalent conditions. Another example is described with reference to FIG. 3(B), in which a camera 38 and a camera 39 serve as the camera 10. As shown in FIG. 3(B), in a field where the gap between a row 35 and a row 36 of crop trees is too narrow for a tractor to travel, the images may be captured by the camera 38 and the camera 39 attached to a drone 37. The drone 37 is fitted with the camera 39, which photographs the row 36 of crop trees on the left side with respect to the traveling direction indicated by the arrow, and the camera 38, which photographs the row 35 on the right side. Accordingly, as the drone 37 moves between the row 35 and the row 36 in the traveling direction indicated by the arrow, the camera 39 captures a plurality of images of the crop trees in the row 36 and the camera 38 captures a plurality of images of the crop trees in the row 35.

Images of the crop trees may also be captured by a camera mounted on a self-propelled robot. Although two cameras are used in FIGS. 3(A) and 3(B), the number of cameras used for photographing is not limited to any particular number.

Whatever method is used to capture images of the crop trees, the camera 10 attaches to each captured image the shooting information at the time of capture (Exif information recording the shooting position (for example, a position measured by GPS), the shooting date and time, information on the camera 10, and the like) and outputs them together.
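As an illustration of the shooting information carried with each image, the following Python sketch turns an already-decoded Exif dictionary into a small record. The `ShootingInfo` class, the function names, and the dictionary layout are assumptions for illustration only; the keys are named after the standard Exif tags `DateTimeOriginal`, `GPSLatitude`, `GPSLatitudeRef`, `GPSLongitude`, `GPSLongitudeRef`, and `Model`, and decoding the Exif block from the image file itself is left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple

# Exif stores timestamps as "YYYY:MM:DD HH:MM:SS".
EXIF_DATETIME_FORMAT = "%Y:%m:%d %H:%M:%S"


@dataclass(frozen=True)
class ShootingInfo:
    position: Optional[Tuple[float, float]]  # (latitude, longitude), decimal degrees
    captured_at: datetime
    camera: str


def dms_to_degrees(dms, ref):
    """Convert Exif-style (degrees, minutes, seconds) plus an N/S/E/W
    reference into signed decimal degrees."""
    degrees, minutes, seconds = dms
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if ref in ("S", "W") else value


def build_shooting_info(exif):
    """Build a ShootingInfo from a decoded Exif dict (layout is assumed)."""
    position = None
    if "GPSLatitude" in exif and "GPSLongitude" in exif:
        position = (
            dms_to_degrees(exif["GPSLatitude"], exif.get("GPSLatitudeRef", "N")),
            dms_to_degrees(exif["GPSLongitude"], exif.get("GPSLongitudeRef", "E")),
        )
    return ShootingInfo(
        position=position,
        captured_at=datetime.strptime(exif["DateTimeOriginal"], EXIF_DATETIME_FORMAT),
        camera=exif.get("Model", "unknown"),
    )
```

A record of this kind is one plausible form for the "information relating to image capturing" that a server could compare against the learning environments of its registered models.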

Next, the cloud server 12 is described. The captured images and Exif information transmitted from the camera 10 are registered in the cloud server 12. A plurality of learning models (detectors/settings) for detecting image areas relating to crops in captured images are also registered in the cloud server 12, and each learning model was trained in a learning environment different from the others. From the plurality of learning models that it holds, the cloud server 12 selects candidates for the learning model to be used when detecting image areas relating to crops in captured images, and presents the candidates to the information processing apparatus 13.

A CPU 191 executes various kinds of processing using computer programs and data stored in a RAM 192 and a ROM 193. The CPU 191 thereby controls the operation of the cloud server 12 as a whole, and executes or controls the various kinds of processing described as being performed by the cloud server 12.

The RAM 192 has an area for storing computer programs and data loaded from the ROM 193 or an external storage device 196, and an area for storing data received from outside via an I/F 197. The RAM 192 also has a work area used by the CPU 191 when executing various kinds of processing. In this way the RAM 192 can provide various areas as needed.

The ROM 193 stores setting data for the cloud server 12, computer programs and data for starting up the cloud server 12, computer programs and data for the basic operation of the cloud server 12, and the like.

An operation unit 194 is a user interface such as a keyboard, a mouse, or a touch panel, which the user operates to input various instructions to the CPU 191.

A display unit 195 has a screen such as a liquid crystal screen or a touch panel screen, and can display the results of processing by the CPU 191 as images, text, and the like. The display unit 195 may be a projection device, such as a projector, that projects images and text.

The external storage device 196 is a large-capacity information storage device such as a hard disk drive. The external storage device 196 stores an OS (operating system) and the computer programs and data that cause the CPU 191 to execute or control the various kinds of processing described as being performed by the cloud server 12. The data stored in the external storage device 196 includes the data relating to the learning models described above. The computer programs and data stored in the external storage device 196 are loaded into the RAM 192 as needed under the control of the CPU 191 and are then processed by the CPU 191.

The I/F 197 is a communication interface for data communication with the outside, and the cloud server 12 transmits and receives data to and from the outside via the I/F 197. The CPU 191, the RAM 192, the ROM 193, the operation unit 194, the display unit 195, the external storage device 196, and the I/F 197 are all connected to a system bus 198. The configuration of the cloud server 12 is not limited to that shown in FIG. 1.

The captured images and Exif information output from the camera 10 may be stored temporarily in the memory of another device and then transferred from that memory to the cloud server 12 via the communication network 11.

Next, the information processing apparatus 13 is described. The information processing apparatus 13 is a computer device such as a PC (personal computer), a smartphone, or a tablet terminal. The information processing apparatus 13 presents to the user the learning model candidates presented by the cloud server 12, accepts the user's selection of a learning model, and notifies the cloud server 12 of the selected learning model. Using the learning model notified from the information processing apparatus 13 (the learning model the user selected from the candidates), the cloud server 12 detects image areas relating to crops in the images captured by the camera 10 (object detection processing) and performs the analysis processing described above.

A CPU 131 performs various kinds of processing using computer programs and data stored in a RAM 132 and a ROM 133. The CPU 131 thereby controls the operation of the information processing apparatus 13 as a whole, and executes or controls the various kinds of processing described as being performed by the information processing apparatus 13.

The RAM 132 has an area for storing computer programs and data loaded from the ROM 133, and an area for storing data received from the camera 10 or the cloud server 12 via an input I/F 135. The RAM 132 also has a work area used by the CPU 131 when executing various kinds of processing. In this way the RAM 132 can provide various areas as needed.

The ROM 133 stores setting data for the information processing apparatus 13, computer programs and data for starting up the information processing apparatus 13, computer programs and data for the basic operation of the information processing apparatus 13, and the like.

An output I/F 134 is an interface that the information processing apparatus 13 uses to output or transmit various kinds of information to the outside.

The input I/F 135 is an interface that the information processing apparatus 13 uses to input or receive various kinds of information from the outside.

A display device 14 has a liquid crystal screen or a touch panel screen and can display the results of processing by the CPU 131 as images, text, and the like. The display device 14 may be a projection device, such as a projector, that projects images and text.

A user interface 15 includes a keyboard and a mouse, which the user operates to input various instructions to the CPU 131. The configuration of the information processing apparatus 13 is not limited to that shown in FIG. 1. For example, the apparatus may have a large-capacity information storage device such as a hard disk drive, and computer programs and data, such as those for the GUI described later, may be stored on that hard disk drive. The user interface 15 may also include a touch sensor such as a touch panel.

Next, the flow of a task is described in which the yield of crops harvested in a field is predicted, at a stage earlier than the harvest time, from captured images of the field taken by the camera 10. If the yield were to be predicted simply by counting the fruit or other harvest targets at harvest time, the goal could be achieved by having a classifier detect the target fruit in the captured images using a method called specific object detection. Because the fruit itself has a highly distinctive appearance, this method detects it with a classifier that has learned that appearance.

In this embodiment, when the crop is a fruit, the system does not stop at counting the fruit after it has matured but predicts the yield of the fruit at a stage earlier than the harvest time. For example, the yield is predicted from the number of detected inflorescences that will later become fruit, from detected dead branches and lesion regions where fruit is unlikely to form, or from how densely the leaves of the trees have grown. Such prediction requires a prediction method that copes with the fact that the growth state of the crop varies with the shooting time and the climate. In other words, a prediction method with good prediction performance must be selected according to the state of the crop. In this case, a learning model that matches the field to be predicted can be expected to make the above prediction appropriately.

Here, the various objects appearing in a captured image are classified into classes such as a crop tree trunk class, a branch class, a dead branch class, and a support post class, and the yield is predicted from the classes. Because the appearance of objects belonging to classes such as the trunk class and the branch class changes with the shooting time, a one-size-fits-all prediction is difficult. Such difficult cases are shown in FIG. 4.
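The per-class bookkeeping can be illustrated with a short Python sketch. The class identifiers and the `dead_branch_fraction` signal are assumptions chosen for illustration; the text at this point names the classes but does not say how the yield is computed from them.

```python
from collections import Counter

# Class names follow the text; the string identifiers are illustrative.
CLASSES = ("trunk", "branch", "dead_branch", "support_post")


def count_by_class(labels):
    """Count detected objects per class, reporting zero for unseen classes.

    `labels` is a list of class identifiers, one per detected object.
    """
    unknown = set(labels) - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown classes: {sorted(unknown)}")
    counts = Counter(labels)
    return {name: counts[name] for name in CLASSES}


def dead_branch_fraction(counts):
    """Fraction of branch detections that are dead branches
    (0.0 when no branches were detected). An assumed, illustrative signal."""
    total = counts["branch"] + counts["dead_branch"]
    return counts["dead_branch"] / total if total else 0.0
```

For example, `count_by_class(["trunk", "branch", "branch", "branch", "dead_branch"])` gives one trunk, three branches, one dead branch, and no support posts, so `dead_branch_fraction` evaluates to 0.25 for that image.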

FIGS. 4(A) and 4(B) show examples of images captured by the camera 10. Crop trees appear at nearly equal intervals in these images, but the fruit to be harvested has not yet formed, so a task of detecting fruit cannot be run on them. The trees in FIG. 4(A) were photographed relatively early in the season, and the trees in FIG. 4(B) were photographed after the leaves had grown in to some extent. In FIG. 4(A), every branch carries about the same amount of foliage, so it can be judged that there is no poorly growing region and that the whole area is harvestable. In FIG. 4(B), by contrast, the foliage on the branches near a central region 41 clearly differs from the rest, and it is easy to judge that growth there is poor. Yet the appearance of the central region 41 (a region with few leaves) can also be found, as a similar pattern, near a region 40 in FIG. 4(A). What these two cases show is that abnormal regions of crop trees cannot be identified from local patterns alone. In other words, a judgment cannot be made from local-pattern input alone, as in the specific object detection described above; the context obtained from the entire image must be reflected.

That is, sufficient performance cannot be obtained unless the specific object detection described above is performed with a learning model trained on images of crops in a similar growth state captured in the past under similar conditions.

To handle every case, including not only the input of an image captured in a new field that has never been photographed before, or of an image captured under conditions that differ from earlier ones because of some external factor such as a prolonged drought or extremely heavy rainfall, but also the input of an image captured at whatever time is convenient for the user, a learning model trained under conditions close to those of the input image must be obtained each time.

ここで、圃場の撮影を行うたびに毎回アノテーション作業とディープラーニングによる学習を実施する場合に、どのようなアノテーション作業が必要となるのかについて説明する。例えば、図４（Ａ）、図４（Ｂ）のそれぞれの撮影画像に対してアノテーション作業を行った結果を、それぞれ図５（Ａ）、図５（Ｂ）に示す。 Here, what kind of annotation work is required when performing annotation work and learning by deep learning every time a field is photographed will be described. For example, the results of annotation work on the captured images of FIGS. 4 (A) and 4 (B) are shown in FIGS. 5 (A) and 5 (B), respectively.

The rectangular regions 500 to 504 in the captured image of FIG. 5(A) are image regions designated by the annotation work. The rectangular region 500 is designated as a region of normal branches, and the rectangular regions 501 to 504 are designated as tree trunk regions. Because the rectangular region 500 represents a normal state of tree growth, it is strongly related to yield prediction. Hereinafter, a region such as the rectangular region 500, which represents a normal state of tree growth and from which fruit or the like can be harvested, is referred to as a production region.

The rectangular regions 505 to 507 and 511 to 514 in the captured image of FIG. 5(B) are image regions designated by the annotation work. The rectangular regions 505 and 507 are designated as regions of normal branches, and the rectangular region 506 is designated as a region of abnormal, dead branches. A region such as the rectangular region 506, which represents an abnormal state and from which fruit or the like cannot be harvested, is referred to as a non-production region. The rectangular regions 511 to 514 are designated as tree trunk regions. Because the regions judged to be harvestable (production regions) are the rectangular regions 505 and 507, these are the regions strongly related to yield prediction.

Carrying out such annotation work on a large number of captured images (for example, several hundred to several thousand) each time a field is photographed is very costly. The present embodiment therefore obtains good prediction results without this burdensome annotation work. In the present embodiment, learning models are obtained by deep learning, but the method of obtaining them is not limited to any specific method. Various object detectors may also be used in place of the learning models.

Next, the processing that the system according to the present embodiment performs in order to carry out analysis on the basis of images of a field captured by the camera 10, such as predicting the yield of the field or calculating the non-production rate for the whole field, will be described with reference to the flowchart of FIG. 2A.

In step S20, the camera 10 photographs the field while a moving body such as the farm tractor 32 or the drone 37 is moving, thereby generating captured images of the field.

In step S21, the camera 10 attaches the Exif information (shooting information) described above to each captured image generated in step S20, and transmits the captured image with the attached Exif information to the cloud server 12 and the information processing apparatus 13 via the communication network 11.

In step S22, the CPU 131 of the information processing apparatus 13 acquires information about the field and crops photographed by the camera 10 (crop variety and tree age, cultivation method, pruning method, and so on) as photographed field parameters. For example, the CPU 131 displays the GUI (graphical user interface) shown in FIG. 6(A) on the display device 14 and accepts input of the photographed field parameters from the user.

In the GUI of FIG. 6(A), a map of the entire field is displayed in the area 600. The map is divided into multiple sections, and each section shows its unique identifier (ID). The user operates the user interface 15 either to specify the location in the area 600 that corresponds to the section photographed by the camera 10 (that is, the section to be analyzed), or to enter the identifier of that section in the area 601. When the user specifies the location in the area 600, the identifier of the corresponding section is displayed in the area 601.

By operating the user interface 15, the user can enter the crop name in the area 602, the crop variety in the area 603, the Trellis in the area 604, and the PlantedYear in the area 605. When the crop is grapes, for example, Trellis is the way the grapevines are designed (trained) for growing grapes in a vineyard, and PlantedYear is the time at which the grapevines were planted. Entering photographed field parameters for all of these items is not mandatory.

When the user operates the user interface 15 to press the registration button 606, the CPU 131 of the information processing apparatus 13 transmits the photographed field parameters entered for the above items in the GUI of FIG. 6(A) to the cloud server 12. The CPU 191 of the cloud server 12 stores (registers) the received photographed field parameters in the external storage device 196.

When the user operates the user interface 15 to press the correction button 607, the CPU 131 of the information processing apparatus 13 makes the photographed field parameters already entered in the GUI of FIG. 6(A) editable.

The GUI of FIG. 6(A) is designed for entering photographed field parameters specifically for managing a vineyard, but even for that purpose the parameters the user is asked to enter are not limited to those shown in FIG. 6(A). Likewise, when the crop is not grapes, the parameters are not limited to those shown in FIG. 6(A). For example, changing the crop name entered in the area 602 may change the titles of the areas 603 to 605 and the parameters to be entered there.

The photographed field parameters entered in the GUI of FIG. 6(A) basically remain fixed once they have been set, so when the field is photographed every year to predict yield, for example, the registered parameters can simply be recalled and reused. If parameters have already been registered for a desired section, then from the next time onward, specifying the location of that section in the area 600 displays the corresponding parameters in the areas 609 to 613, as shown in FIG. 6(B).

Entering all of the photographed field parameters correctly is desirable for the learning model selection performed later, but even if some parameters cannot be entered because the user does not know them, the subsequent processing can proceed with those parameters left unknown.

In step S23, processing is performed to select candidate learning models to be used for detecting objects such as crops in the captured images. The details of step S23 are described with reference to the flowchart of FIG. 2B.

In step S230, the CPU 191 of the cloud server 12 generates query parameters from the Exif information attached to each captured image acquired from the camera 10 and the photographed field parameters registered in the external storage device 196 (those of the section corresponding to the captured images).

FIG. 11(A) shows a configuration example of the query parameters. These are the query parameters generated when the photographed field parameters of FIG. 6(B) have been entered.

"Query name" is set to "F5", entered in the area 609. "Variety" is set to "Shiraz", entered in the area 611. "Trellis" is set to "Scott-Henry", entered in the area 612. "Tree age" is set to "19", the number of years elapsed from "2001", entered in the area 613, to the shooting date and time (year) in the Exif information. "Shooting date" is set to "Oct 20", the shooting date and time (month and day) in the Exif information. "Shooting time window" is set to "12:00-14:00", the span from the earliest to the latest shooting date and time (time of day) in the Exif information attached to the captured images received from the camera 10. "Latitude, longitude" is set to the shooting position "35°28'S, 149°12'E" in the Exif information.
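The derivation of the query parameters described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `build_query`, the dictionary keys, and the shooting year 2020 are assumptions chosen so that the values match the example of FIG. 11(A).

```python
from datetime import datetime

def build_query(field_params, exif_datetimes, exif_position):
    """Combine registered field parameters with Exif data into query parameters.

    All key names are hypothetical; unknown items are left as None.
    """
    first, last = min(exif_datetimes), max(exif_datetimes)
    planted = field_params.get("planted_year")
    return {
        "query_name": field_params.get("section_id"),
        "variety": field_params.get("variety"),
        "trellis": field_params.get("trellis"),
        # Tree age = shooting year minus planted year (None if unknown).
        "tree_age": first.year - planted if planted is not None else None,
        "shooting_date": first.strftime("%b %d"),
        # Span from the earliest to the latest shooting time.
        "shooting_time_window": f"{first:%H:%M}-{last:%H:%M}",
        "lat_lon": exif_position,
    }

query = build_query(
    {"section_id": "F5", "variety": "Shiraz",
     "trellis": "Scott-Henry", "planted_year": 2001},
    [datetime(2020, 10, 20, 12, 0), datetime(2020, 10, 20, 14, 0)],
    "35°28'S, 149°12'E",
)
partial = build_query({"section_id": "F5"}, [datetime(2020, 10, 20, 12, 0)], None)
```

In `partial`, the variety, Trellis, and tree age stay `None`, which corresponds to a partially blank query.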

The method of generating the query parameters is not limited to the above. For example, data that the grower already uses for field management may be read in, and the set of parameters matching the above items may be used as the query parameters.

In some cases, information for some items may be unknown. For example, if the PlantedYear or the variety is not known, not all of the items illustrated in FIG. 11(A) can be filled in. The query parameters are then partially blank, as shown in FIG. 11(C).

Next, in step S231, the CPU 191 of the cloud server 12 selects M (1 ≤ M < E) candidate learning models from the E (E is an integer of 2 or more) learning models stored in the external storage device 196. In this selection, learning models trained on an environment similar to the environment indicated by the query parameters are selected as candidate learning models. For each of the E learning models, the external storage device 196 stores a parameter set indicating the environment on which the model was trained. FIG. 11(B) shows a configuration example of the parameter sets of the learning models in the external storage device 196.

"Model name" is the name of the learning model, "Variety" is the crop variety on which the model was trained, and "Trellis" is the grapevine design method for growing grapes in a vineyard on which the model was trained. "Tree age" is the age of the crops on which the model was trained, and "Shooting date" is the shooting date and time of the captured images of the crops used for training. "Shooting time window" is the period from the earliest to the latest shooting date and time among the captured images used for training, and "Latitude, longitude" is the shooting position of the captured images used for training, for example "35°28'S, 149°12'E".

Some learning models are trained on a mixture of datasets collected from multiple blocks of fields. A parameter set may therefore contain multiple settings (variety, tree age, and so on), as with the learning models named "M004" and "M005".

The CPU 191 of the cloud server 12 therefore obtains the similarity between the query parameters and the parameter set of each learning model shown in FIG. 11(B), and selects the top M learning models in descending order of similarity as candidate learning models.

Denoting the parameter sets of the learning models with model names M001, M002, ... as M_1, M_2, ..., the CPU 191 of the cloud server 12 obtains the similarity D(Q, M_x) between the query parameters Q and a parameter set M_x by evaluating equation (1).
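Equation (1) is not reproduced in this text. A form consistent with the symbols q_k, m_{x,k}, f_k, and α_k used in the surrounding description is a weighted sum of per-element distances; the exact form is an assumption:

```latex
D(Q, M_x) = \sum_{k} \alpha_k \, f_k\!\left(q_k,\, m_{x,k}\right) \tag{1}
```

Under this reading, D takes a smaller value the closer the training environment of a model is to the query, so the most similar models are those with the smallest D.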

Here, q_k denotes the k-th element from the beginning of the query parameters Q. In the case of FIG. 11(A), Q contains six elements ("Variety", "Trellis", "Tree age", "Shooting date", "Shooting time window", and "Latitude, longitude"), so k = 1 to 6.

m_{x,k} denotes the k-th element from the beginning of the parameter set M_x. In the case of FIG. 11(B), the parameter set contains six elements ("Variety", "Trellis", "Tree age", "Shooting date", "Shooting time window", and "Latitude, longitude"), so k = 1 to 6.

f_k(a_k, b_k) is a predefined function that gives the distance between the elements a_k and b_k. It may be tuned carefully through experiments in advance, but the distance defined by equation (1) basically only needs to take a larger value the more the learning models differ in character, so f_k can be defined simply as follows.

Elements fall basically into two types: categorical elements (variety, Trellis) and continuous-valued elements (tree age, shooting date, ...). The function defining the distance between categorical elements is given by equation (2), and the function defining the distance between continuous-valued elements is given by equation (3).
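Equations (2) and (3) are not reproduced in this text. Simple definitions that satisfy the stated requirement (a larger value for more dissimilar elements) would be a 0/1 mismatch indicator for categorical elements and an absolute difference for continuous-valued elements; both forms are assumptions:

```latex
f_k(a_k, b_k) =
\begin{cases}
0 & \text{if } a_k = b_k \\
1 & \text{otherwise}
\end{cases}
\tag{2}

f_k(a_k, b_k) = \left| a_k - b_k \right| \tag{3}
```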

The functions for all elements (k) are implemented in advance on a rule basis. The weight α_k is also determined in advance according to how strongly each element affects the final distance between models. For example, a difference in "Variety" (k = 1) hardly shows up as a difference in the images, so α_1 is set very close to 0, whereas a difference in "Trellis" (k = 2) has a large effect, so α_2 is set to a large value.

For a learning model in which multiple settings are registered for "Variety" or "Tree age", such as the models named "M004" and "M005" in FIG. 11(B), the distance is computed for each registered setting and averaged. For "Variety", for example, the distance is computed for each setting registered under "Variety", and the average of those distances is taken as the distance for "Variety". "Tree age" is handled in the same way.

As long as the CPU 191 of the cloud server 12 selects the M candidate learning models on the basis of the above similarity, the selection method is not limited to a specific one. For example, the CPU 191 may select the M learning models whose similarity is equal to or higher than a threshold.

However, if every element of the query parameters is empty, the processing of step S231 is not performed, and consequently all of the learning models are treated as candidate learning models in the subsequent processing.
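The candidate selection of step S231 can be sketched as below. This is a minimal illustration under several assumptions: only three elements are compared, the weights and the parameter values of the models are made up, blank query elements are skipped, and a smaller weighted distance is treated as a higher similarity.

```python
def categorical_distance(a, b):
    return 0.0 if a == b else 1.0      # mismatch indicator

def continuous_distance(a, b):
    return abs(a - b)                  # absolute difference

# Hypothetical per-element distance functions f_k and weights alpha_k.
DISTANCE_FUNCS = {"variety": categorical_distance,
                  "trellis": categorical_distance,
                  "tree_age": continuous_distance}
WEIGHTS = {"variety": 0.01, "trellis": 5.0, "tree_age": 0.1}

def element_distance(key, q_value, m_value):
    # A model trained on mixed datasets may register several settings;
    # the distances to each setting are averaged.
    m_values = m_value if isinstance(m_value, (list, tuple)) else [m_value]
    f = DISTANCE_FUNCS[key]
    return sum(f(q_value, v) for v in m_values) / len(m_values)

def model_distance(query, param_set):
    total = 0.0
    for key, weight in WEIGHTS.items():
        q_value = query.get(key)
        if q_value is None:            # blank element: skipped (assumption)
            continue
        total += weight * element_distance(key, q_value, param_set[key])
    return total

def select_candidates(query, models, m):
    if all(query.get(key) is None for key in WEIGHTS):
        return list(models)            # empty query: keep every model
    ranked = sorted(models, key=lambda name: model_distance(query, models[name]))
    return ranked[:m]

models = {
    "M001": {"variety": "Shiraz", "trellis": "Scott-Henry", "tree_age": 20},
    "M002": {"variety": "Merlot", "trellis": "Scott-Henry", "tree_age": 18},
    "M003": {"variety": "Shiraz", "trellis": "VSP", "tree_age": 19},
    "M004": {"variety": ["Shiraz", "Merlot"], "trellis": "Scott-Henry",
             "tree_age": [5, 19]},
}
query = {"variety": "Shiraz", "trellis": "Scott-Henry", "tree_age": 19}
candidates = select_candidates(query, models, 2)
```

With these illustrative weights, a Trellis mismatch dominates the distance, so "M003" ranks last even though its variety and tree age match the query exactly.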

Selecting candidate learning models has several benefits. First, excluding in this step the learning models that prior knowledge marks as unlikely greatly reduces the processing time of later steps, such as scoring the learning models to build a ranking. Second, even with rule-based scoring, including learning models that never needed to be compared can lower the accuracy of model selection; narrowing the candidates keeps that risk to a minimum.

Next, in step S232, the CPU 191 of the cloud server 12 selects P (P is an integer of 2 or more) captured images from those received from the camera 10 as model selection target images. The method of selecting the P images is not limited to a specific one. For example, the CPU 191 may select the P images at random, or according to some criterion.

Next, in step S233, the P captured images selected in step S232 are used to select one of the M candidate learning models as the selected learning model. The details of step S233 are described with reference to the flowchart of FIG. 2C.

In step S2330, for each of the M candidate learning models, the CPU 191 of the cloud server 12 performs object detection processing, that is, processing that uses the candidate learning model to detect objects in each of the P captured images.

As a result, for each of the P captured images, the object detection result of each of the M candidate learning models is obtained. In the present embodiment, the object detection result for a captured image is the position information of the image regions (rectangular regions, detection regions) of the objects detected in that image.

In step S2331, the CPU 191 obtains, for each of the M candidate learning models, a score for its object detection results on the P captured images. The CPU 191 then ranks the M candidate learning models by score and selects N (N ≤ M) candidate learning models from them.

Because the captured images have no annotation information, detection accuracy cannot be evaluated exactly. However, for a target that is laid out and maintained according to a plan, as a farm is, the accuracy of object detection can be estimated and evaluated with rules such as the following. The score for the object detection results of a candidate learning model is obtained, for example, as follows.

In a typical field, crops are planted at equal intervals, as shown in FIGS. 3(A) and 3(B). Therefore, when objects are detected in the manner of the annotations (rectangular regions) illustrated in FIGS. 5(A) and 5(B), correct detection always yields rectangular regions running continuously from the left edge to the right edge of the image.

For example, when the whole image from the left edge to the right edge is a harvestable region, as in FIG. 5(A), a production region such as the rectangular region 500 should be detected. When the image contains a non-production region such as the rectangular region 506, as in FIG. 5(B), the rectangular regions 505, 506, and 507 should likewise be detected from the left edge to the right edge. If object detection is performed on a captured image with a learning model that does not match the conditions of that image, some of these rectangular regions may go undetected. The further the conditions a learning model corresponds to are from those of the captured image, the more likely this becomes. The simplest scoring method for evaluating candidate learning models is therefore, for example, the following.

The candidate learning model of interest detects multiple object detection regions in the captured image of interest. The image is searched in the vertical direction for detection regions, the number of pixels Cp in the area where no detection region is found is counted, and the ratio of Cp to the number of pixels across the image width is taken as the penalty score of that image. A penalty score is obtained in this way for each of the P captured images processed by the candidate learning model of interest, and the sum of these penalty scores is the score of that model. Repeating this for each of the M candidate learning models fixes the score of every candidate. The M candidate learning models are then ranked in ascending order of score, and the top N with the smallest scores are selected. The condition "score below a threshold" may be added to this selection.
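The penalty scoring can be sketched as follows. The sketch assumes one reading of the description: each pixel column is searched vertically, Cp is the number of columns not covered by any detection region, and boxes are given as hypothetical `(x_min, y_min, x_max, y_max)` tuples with `x_max` exclusive.

```python
def penalty_score(boxes, image_width):
    """Fraction of pixel columns that no detection region covers (Cp / width)."""
    covered = [False] * image_width
    for x_min, _y_min, x_max, _y_max in boxes:
        for x in range(max(0, x_min), min(image_width, x_max)):
            covered[x] = True
    cp = covered.count(False)          # columns without any detection region
    return cp / image_width

def model_score(boxes_per_image, image_width):
    # Sum of the penalty scores over the P captured images.
    return sum(penalty_score(boxes, image_width) for boxes in boxes_per_image)

def select_top_n(results, image_width, n):
    # Rank models in ascending order of score and keep the N smallest.
    scores = {name: model_score(b, image_width) for name, b in results.items()}
    return sorted(scores, key=scores.get)[:n]

# Two images, 100 pixels wide; the detection results are made up.
results = {
    "M001": [[(0, 10, 50, 60), (50, 10, 100, 60)], [(0, 0, 100, 50)]],
    "M002": [[(0, 10, 40, 60)], [(20, 0, 80, 50)]],
    "M003": [[], [(0, 0, 100, 50)]],
}
best = select_top_n(results, image_width=100, n=2)
```

Here "M001" covers every column in both images and scores 0, while "M003" misses the first image entirely and incurs the full penalty of 1 for it.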

The score of a candidate learning model may also be derived from the detection regions of tree trunks, which are normally planted at equal intervals. As shown in FIG. 5(A), trunks should be detected at roughly equal intervals, as with the rectangular regions 501, 502, 503, and 504, so the number of trunk regions expected for a given image width is fixed. A captured image with fewer or more detections than expected is likely to contain detection errors, so this count may be reflected in the score.
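One way to reflect the trunk count in the score is sketched below. The text does not specify how the count is used, so the absolute deviation from the expected count and the parameter `trunk_spacing_px` (the assumed on-image distance between neighboring trunks) are both assumptions.

```python
def trunk_count_penalty(trunk_boxes, image_width, trunk_spacing_px):
    """Deviation of the detected trunk count from the count expected for the width."""
    expected = round(image_width / trunk_spacing_px)
    return abs(len(trunk_boxes) - expected)

# A 4000-pixel-wide image with trunks roughly every 1000 pixels: 4 expected.
four_trunks = [(480, 900, 520, 1500), (1490, 900, 1530, 1500),
               (2500, 900, 2540, 1500), (3510, 900, 3550, 1500)]
ok_penalty = trunk_count_penalty(four_trunks, 4000, 1000)
missed_penalty = trunk_count_penalty(four_trunks[:2], 4000, 1000)
```

The penalty could be added to the per-image penalty score with a suitable weight, so that models that miss or hallucinate trunks rank lower.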

The CPU 191 then transmits to the information processing apparatus 13 the P captured images, the object detection results on those P images for each of the N candidate learning models selected from the M candidates, and information about the N candidate learning models (model names and so on). As noted above, in the present embodiment the object detection result for a captured image is the position information of the image regions (rectangular regions, detection regions) of the detected objects, and this position information is transmitted to the information processing apparatus 13 as data in a file format such as JSON or txt.
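As an illustration of how such position information might be serialized as JSON, consider the sketch below. The text does not define a schema, so the field names (`model`, `images`, `file`, `boxes`, `label`, and the coordinate keys) and the file name are all hypothetical.

```python
import json

def detections_to_json(model_name, detections):
    """Serialize per-image detection regions for one candidate learning model."""
    images = []
    for file_name, boxes in detections.items():
        images.append({
            "file": file_name,
            "boxes": [
                {"label": label, "x_min": x0, "y_min": y0,
                 "x_max": x1, "y_max": y1}
                for label, x0, y0, x1, y1 in boxes
            ],
        })
    return json.dumps({"model": model_name, "images": images}, indent=2)

payload = detections_to_json(
    "M002",
    {"img_0001.jpg": [("production", 0, 120, 640, 480),
                      ("trunk", 300, 400, 340, 720)]},
)
decoded = json.loads(payload)   # what the receiving side would parse
```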

Next, the user is asked to select one of the N selected candidate learning models. At the end of step S2331, N candidate learning models still remain, and the output on which a performance comparison can be based is the object detection results on the P captured images, so the user would have to compare N × P object detection results. Narrowing the candidates down to one selected learning model appropriately under those conditions is difficult.

Therefore, in step S2332, the CPU 131 of the information processing apparatus 13 scores the P captured images (display image scoring) so that information can be presented in a form the user can easily compare subjectively. In display image scoring, each of the P captured images is given a larger score the more the arrangement patterns of detection regions differ among the N candidate learning models. Such a score can be obtained, for example, by evaluating equation (4).
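Equation (4) is not reproduced in this text. A form consistent with the description, in which the pairwise differences between the N candidate learning models are accumulated for each image, would be the following; the index ranges are an assumption:

```latex
\mathrm{Score}(z) = \sum_{a=1}^{N-1} \sum_{b=a+1}^{N} T_{I_z}\!\left(M_a, M_b\right) \tag{4}
```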

Here, Score(z) is the score for the captured image Iz. T_Iz(Ma, Mb) is a function that gives a score based on the difference between the object detection result (arrangement pattern of detection regions) of the candidate learning model Ma on Iz and that of the candidate learning model Mb on Iz. Various functions can be used, and T_Iz is not limited to a specific one. For example, T_Iz(Ma, Mb) may be a function that, for each detection region Ra that Ma detected in Iz, finds the detection region Rb' closest to Ra among the detection regions Rb that Mb detected in Iz, computes the difference between the position of Rb' (for example, its upper-left and lower-right corners) and the position of Ra, and returns the sum of these differences.

The results of object detection processing by the top N candidate learning models are similar in most cases, so comparing them on randomly selected images often shows almost no difference and gives no basis for choosing a learning model. Looking only at the captured images ranked highest by equation (4) above therefore makes it easier to judge which learning model is better.

In step S2333, the CPU 131 of the information processing apparatus 13 causes the display device 14 to display, for each of the N candidate learning models, the top F captured images (a specified number from the top) in descending order of score among the P captured images received from the cloud server 12, together with the results of the object detection processing for those captured images received from the cloud server 12 (display control). The F captured images are displayed side by side from the left in descending order of score.
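The choice of the F images to display can be sketched as a sort on the display image score. The function name `top_f_images` and the use of image identifiers as dictionary keys are assumptions made for illustration only.

```python
def top_f_images(scores: dict[str, float], f: int) -> list[str]:
    """Return the identifiers of the F highest-scoring images.

    The returned order (largest score first) is the left-to-right
    display order described for step S2333.
    """
    return sorted(scores, key=lambda image_id: scores[image_id], reverse=True)[:f]
```

Because the same ordered list is used for every candidate learning model, each column of the GUI shows one and the same captured image.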

候補学習モデルごとの撮影画像およびオブジェクト検出処理の結果を表示したＧＵＩの表示例を図７（Ａ）に示す。図７（Ａ）では、Ｎ＝３、Ｆ＝４のケースについて示している。 FIG. 7A shows a display example of the GUI displaying the captured image and the result of the object detection process for each candidate learning model. FIG. 7A shows the case of N = 3 and F = 4.

最上の行には、スコアが最も大きい候補学習モデルのモデル名「Ｍ００２」がラジオボタン７０と共に表示されており、その右側には、スコアが大きい順に上位４枚の撮影画像が左から順に並んで表示されている。該撮影画像にはモデル名「Ｍ００２」の候補学習モデルが該撮影画像から検出したオブジェクトの検出領域を示す枠が重ねて表示されている。 In the top row, the model name "M002" of the candidate learning model with the highest score is displayed together with the radio button 70, and on the right side, the top four captured images are arranged in order from the left in descending order of score. It is displayed. In the captured image, a frame showing the detection area of the object detected by the candidate learning model of the model name “M002” is displayed superimposed on the captured image.

中段の行には、スコアが２番目に大きい候補学習モデルのモデル名「Ｍ０１１」がラジオボタン７０と共に表示されており、その右側には、スコアが大きい順に上位４枚の撮影画像が左から順に並んで表示されている。該撮影画像にはモデル名「Ｍ０１１」の候補学習モデルが該撮影画像から検出したオブジェクトの検出領域を示す枠が重ねて表示されている。 In the middle row, the model name "M011" of the candidate learning model with the second highest score is displayed together with the radio button 70, and on the right side, the top four captured images in descending order of score are displayed in order from the left. They are displayed side by side. In the captured image, a frame showing the detection area of the object detected by the candidate learning model of the model name “M011” is displayed superimposed on the captured image.

下段の行には、スコアが３番目に大きい候補学習モデルのモデル名「Ｍ００９」がラジオボタン７０と共に表示されており、その右側には、スコアが大きい順に上位４枚の撮影画像が左から順に並んで表示されている。該撮影画像にはモデル名「Ｍ００９」の候補学習モデルが該撮影画像から検出したオブジェクトの検出領域を示す枠が重ねて表示されている。 In the lower row, the model name "M009" of the candidate learning model with the third highest score is displayed together with the radio button 70, and on the right side, the top four captured images in descending order of score are displayed in order from the left. They are displayed side by side. In the captured image, a frame showing the detection area of the object detected by the candidate learning model of the model name “M009” is displayed superimposed on the captured image.

Note that, in this GUI, the captured images in the same column are the same captured image, so that the results of the object detection processing by the candidate learning models can be compared at a glance.

そして、Ｆ枚の撮影画像に対するＮ個の候補学習モデルによるオブジェクト検出処理の結果の違いをユーザは目視により確認し、Ｎ個の候補学習モデルのうち１つをユーザインターフェース１５を用いて選択する。 Then, the user visually confirms the difference in the result of the object detection processing by the N candidate learning models for the F images, and selects one of the N candidate learning models by using the user interface 15.

ステップＳ２３３４では、情報処理装置１３のＣＰＵ１３１は、ユーザによる候補学習モデルの選択操作（ユーザ操作、ユーザ入力）を受け付ける。ステップＳ２３３５では、情報処理装置１３のＣＰＵ１３１は、ユーザによる候補学習モデルの選択操作（ユーザ入力）が行われたか否かを判断する。 In step S2334, the CPU 131 of the information processing apparatus 13 accepts a candidate learning model selection operation (user operation, user input) by the user. In step S2335, the CPU 131 of the information processing apparatus 13 determines whether or not the selection operation (user input) of the candidate learning model by the user has been performed.

図７（Ａ）の場合、ユーザは、モデル名「Ｍ００２」の候補学習モデルを選択する場合には、最上の行におけるラジオボタン７０をユーザインターフェース１５を用いて選択する。また、ユーザは、モデル名「Ｍ０１１」の候補学習モデルを選択する場合には、中段の行におけるラジオボタン７０をユーザインターフェース１５を用いて選択する。また、ユーザは、モデル名「Ｍ００９」の候補学習モデルを選択する場合には、下段の行におけるラジオボタン７０をユーザインターフェース１５を用いて選択する。図７（Ａ）では、モデル名「Ｍ００２」に対応するラジオボタン７０が選択されているため、モデル名「Ｍ００２」の候補学習モデルが選択されたことを示す枠７４が表示される。 In the case of FIG. 7A, when the user selects the candidate learning model having the model name “M002”, the radio button 70 in the top row is selected by using the user interface 15. Further, when the user selects the candidate learning model having the model name "M011", the user selects the radio button 70 in the middle row using the user interface 15. Further, when the user selects the candidate learning model having the model name "M009", the user selects the radio button 70 in the lower row by using the user interface 15. In FIG. 7A, since the radio button 70 corresponding to the model name “M002” is selected, the frame 74 indicating that the candidate learning model having the model name “M002” is selected is displayed.

Then, when the user operates the user interface 15 and presses the decision button 71, the CPU 131 determines that "the user has performed the candidate learning model selection operation (user input)" and selects the candidate learning model corresponding to the selected radio button 70 as the selective learning model.

この判断の結果、ユーザによる候補学習モデルの選択操作（ユーザ入力）が行われた場合には、処理はステップＳ２３３６に進み、ユーザによる候補学習モデルの選択操作（ユーザ入力）が行われていない場合には、処理はステップＳ２３３４に進む。 As a result of this determination, when the user has performed the candidate learning model selection operation (user input), the process proceeds to step S2336, and the user has not performed the candidate learning model selection operation (user input). The process proceeds to step S2334.

In step S2336, the CPU 131 of the information processing apparatus 13 checks whether exactly one learning model has finally been selected. If so, the process proceeds to step S24; otherwise, the process proceeds to step S2332.

If the user cannot narrow the candidates down to one just by looking at the display of FIG. 7A, the user may select a plurality of candidate learning models by selecting a plurality of radio buttons 70. For example, when the user operates the user interface 15 to select, in FIG. 7A, the radio button 70 corresponding to the model name "M002" and the radio button 70 corresponding to the model name "M011" and then presses the decision button 71, N is set to "2", the number of selected radio buttons 70, and the process proceeds to step S2332 via step S2336. In this case, the same processing is performed from step S2332 onward with N = 2 and F = 4. The process is repeated in this way until the number of learning models finally selected becomes "1".
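The repetition of steps S2332 to S2336 can be sketched as a loop that shrinks the candidate set until one model remains. The function name `narrow_to_one` and the callback `ask_user`, which stands in for scoring, display, and user input, are hypothetical and not part of the embodiment.

```python
from typing import Callable, Sequence


def narrow_to_one(candidates: Sequence[str],
                  ask_user: Callable[[list[str]], list[str]]) -> str:
    """Repeat display and selection until exactly one model remains."""
    remaining = list(candidates)
    while len(remaining) > 1:
        # S2332 to S2334: score images, display results, accept user input.
        chosen = ask_user(remaining)
        # S2335: keep only valid selections; with none, wait for input again.
        picked = [name for name in remaining if name in chosen]
        if not picked:
            continue
        # S2336: N becomes the number of selected models; loop if N > 1.
        remaining = picked
    return remaining[0]
```

For example, a user who first selects "M002" and "M011" and then, on the second display, selects only "M002" ends the loop with "M002" as the selective learning model.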

The user may also select the learning model by using the GUI of FIG. 7B instead of the GUI of FIG. 7A. The GUI of FIG. 7A lets the user directly choose which learning model is better. In the GUI of FIG. 7B, by contrast, a check box 72 is provided for each captured image. For each column of captured images, the user operates the user interface 15 to turn on (add a check mark to) the check box 72 of each captured image in that column for which the result of the object detection processing is judged to be preferable. When the user then operates the user interface 15 and presses the decision button 75, the CPU 131 of the information processing apparatus 13 selects, from among the candidate learning models with the model names "M002", "M011", and "M009", the candidate learning model with the largest number of captured images whose check box 72 is on as the selective learning model. In the example of FIG. 7B, three of the four check boxes 72 for the candidate learning model "M002" are on, one of the four check boxes 72 for the candidate learning model "M011" is on, and none of the four check boxes 72 for the candidate learning model "M009" is on. In this case, the candidate learning model "M002" is selected as the selective learning model. Selecting the selective learning model through such a GUI is effective, for example, when F is large and it is difficult for the user to judge which candidate learning model is best.

If there are candidate learning models whose "number of captured images with the check box 72 on" is equal or differs only slightly, it is determined in step S2336 that "exactly one learning model has not finally been selected", and the process proceeds to step S2332. From step S2332 onward, processing is performed on those candidate learning models whose counts are equal or nearly equal. In this case too, the process is repeated until the number of learning models finally selected becomes "1".
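The check box count and the handling of ties or near ties can be sketched as follows. The function name `shortlist_by_checkboxes` and the parameter `margin`, which defines what counts as a slight difference, are assumptions; the text does not specify how small a difference must be.

```python
def shortlist_by_checkboxes(checked: dict[str, list[bool]],
                            margin: int = 0) -> list[str]:
    """Return the models whose check count is within `margin` of the best.

    checked[name][i] is True when the check box 72 of the i-th displayed
    image is on for that model. A single returned name is the selective
    learning model; several names mean another round from step S2332.
    """
    counts = {name: sum(flags) for name, flags in checked.items()}
    best = max(counts.values())
    return [name for name, count in counts.items() if best - count <= margin]
```

Applied to the FIG. 7B example (three, one, and zero check boxes on), the sketch returns only "M002".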

In addition, because captured images displayed further to the left are those for which the results of object detection processing differ more among the candidate learning models, a larger weight value may be assigned to captured images displayed further to the left. In this case, the sum of the weight values of the captured images whose check box 72 is on may be obtained for each candidate learning model, and the candidate learning model with the largest sum may be selected as the selective learning model.
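The weighted variant can be sketched in the same way. The function name `select_by_weighted_checkboxes` and the example weights are hypothetical; the text only requires that images further to the left receive larger weights.

```python
def select_by_weighted_checkboxes(checked: dict[str, list[bool]],
                                  weights: list[float]) -> str:
    """Pick the model with the largest sum of weights over checked images.

    weights[i] is the weight of the i-th column from the left and is
    assumed to be non-increasing, for example [4, 3, 2, 1] for F = 4.
    """
    totals = {
        name: sum(w for flag, w in zip(flags, weights) if flag)
        for name, flags in checked.items()
    }
    return max(totals, key=lambda name: totals[name])
```

With weights [4, 3, 2, 1], a model checked only on the leftmost image (total 4) is preferred over a model checked on the two rightmost images (total 3), even though the latter has more check marks.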

そして情報処理装置１３のＣＰＵ１３１は、どのような方法で選択学習モデルを選択したとしても、該選択学習モデルを示す情報（例えば選択学習モデルのモデル名）をクラウドサーバ１２に通知する。 Then, the CPU 131 of the information processing apparatus 13 notifies the cloud server 12 of information indicating the selective learning model (for example, the model name of the selective learning model) regardless of the method of selecting the selective learning model.

In step S24, the CPU 191 of the cloud server 12 performs object detection processing on the captured images (the captured images transmitted by the camera 10 to the cloud server 12 and the information processing apparatus 13) by using the selective learning model identified by the information notified from the information processing apparatus 13.

In step S25, the CPU 191 of the cloud server 12 uses the detection regions obtained as a result of the object detection processing in step S24 to perform the intended analysis processing, such as predicting the yield of the field and calculating the non-production rate for the entire field. This calculation takes into account both the rectangular production regions detected from all the captured images and the non-production regions determined to be dead branch regions, lesion regions, and the like.

Although the learning model according to this embodiment is a model trained by deep learning, various object detection techniques, such as a rule-based detector defined by various parameters, fuzzy inference, or a genetic algorithm, may be used as the learning model.

[Second Embodiment]
In this and subsequent embodiments, the differences from the first embodiment are described, and everything not specifically mentioned below is the same as in the first embodiment. This embodiment is described using, as an example, a system that performs visual inspection on a factory production line. The system according to this embodiment detects abnormal regions of the industrial product to be inspected.

Conventionally, in visual inspection on a factory production line, the imaging conditions and other settings of the inspection device (a device that photographs and inspects the appearance of a product) are carefully adjusted for each production line, and it has been common to spend time adjusting the inspection device settings each time a production line is started up. In recent years, however, manufacturing sites have been expected to respond immediately to diversifying customer needs and market changes. There is a growing need for speedy responses such as starting up a line in a short period even for a small lot, manufacturing the quantity that meets demand, and, once sufficient supply has been delivered, immediately dismantling the line to prepare for the next production line.

Under such circumstances, setting up the visual inspection each time based on the experience and intuition of specialists at the manufacturing site, as has been done conventionally, cannot keep up with rapid start-ups. If similar products have been inspected in the past, and the setting parameters used for them are retained and can be recalled when a similar inspection is performed, anyone can set up the inspection device without relying on the experience of specialists.

As in the first embodiment, this object is achieved by applying an already held learning model to the images of the new product to be inspected. The information processing apparatus 13 described above can therefore also be applied to the second embodiment.

本実施形態に係るシステムによる検査装置の設定処理（外観検査用の設定処理）について、図８Ａのフローチャートに従って説明する。なお、外観検査用の設定処理は、製造ラインにおける検査ステップの立ち上げ時に実施することを想定している。 The setting process (setting process for visual inspection) of the inspection device by the system according to the present embodiment will be described with reference to the flowchart of FIG. 8A. It is assumed that the setting process for visual inspection is performed at the start of the inspection step on the production line.

クラウドサーバ１２の外部記憶装置１９６には、撮影画像における外観検査を行うための学習モデル（外観検査用モデル／設定）が複数登録されており、それぞれの学習モデルは互いに異なる学習環境で学習されたモデルである。 A plurality of learning models (models / settings for visual inspection) for performing visual inspection on captured images are registered in the external storage device 196 of the cloud server 12, and each learning model is learned in different learning environments. It is a model.

カメラ１０は、外観検査の対象となる製品（検査対象製品）を撮影するためのカメラである。第１の実施形態と同様、カメラ１０は定期的もしくは不定期的に撮影を行うカメラであっても良いし、動画像を撮影するカメラであっても良い。撮影画像から検査対象製品における正確な異常領域の検出を行うために、異常領域を含む検査対象製品が検査工程に入ってきた場合は、可能な限り異常領域が強調されるような条件で撮影されることが望ましい。カメラ１０は、検査対象製品を複数の条件で撮影するのであればマルチカメラであっても良い。 The camera 10 is a camera for photographing a product (product to be inspected) to be visually inspected. Similar to the first embodiment, the camera 10 may be a camera that shoots images periodically or irregularly, or may be a camera that shoots moving images. In order to accurately detect the abnormal area in the product to be inspected from the captured image, when the product to be inspected including the abnormal area enters the inspection process, the image is taken under the condition that the abnormal area is emphasized as much as possible. Is desirable. The camera 10 may be a multi-camera as long as the product to be inspected is photographed under a plurality of conditions.

In step S80, the camera 10 photographs the product to be inspected to generate a captured image of that product. In step S81, the camera 10 transmits the captured image generated in step S80 to the cloud server 12 and the information processing apparatus 13 via the communication network 11.

In step S82, the CPU 131 of the information processing apparatus 13 acquires, as inspection target product parameters, information on the product to be inspected photographed by the camera 10 (the part name and material of the product, date of manufacture, imaging system parameters at the time of shooting, lot number, temperature, humidity, and the like). For example, the CPU 131 displays a GUI on the display device 14 and accepts input of the inspection target product parameters from the user. When the user operates the user interface 15 and inputs a registration instruction, the CPU 131 of the information processing apparatus 13 transmits the inspection target product parameters entered for each item in the GUI to the cloud server 12. The CPU 191 of the cloud server 12 stores (registers) the inspection target product parameters transmitted from the information processing apparatus 13 in the external storage device 196.

ステップＳ８３では、撮影画像から上記の検査対象製品を検出するために用いる学習モデルを選択するための処理が行われる。ステップＳ８３における処理の詳細について、図８Ｂのフローチャートに従って説明する。 In step S83, a process for selecting a learning model to be used for detecting the above-mentioned inspection target product from the captured image is performed. The details of the process in step S83 will be described with reference to the flowchart of FIG. 8B.

In step S831, the CPU 191 of the cloud server 12 selects M candidate learning models from the E learning models stored in the external storage device 196. The CPU 191 generates query parameters from the inspection target product parameters registered in the external storage device 196 in the same manner as in the first embodiment, and selects learning models trained for environments similar to the environment indicated by the query parameters (learning models used in similar past inspections).

If the query parameters include "board" as the "part name", learning models used in past board inspections are more likely to be selected; if they further include "glass epoxy" as the "material", learning models used to inspect glass epoxy boards are more likely to be selected.

In step S831, as in the first embodiment, the M candidate learning models are selected by using the parameter sets of the learning models and the query parameters; equation (1) above is used for this, as in the first embodiment.

Next, in step S832, the CPU 191 of the cloud server 12 selects P captured images from the captured images received from the camera 10 as model selection target images. For example, products flowing through this inspection process of this production line are selected at random and photographed by the camera 10 with the same settings as in actual operation, and the P captured images are taken from the resulting captured images. Normally, few abnormal products occur on a production line, so if only a small number of products are photographed in this process, the processing in the subsequent steps does not work well. As a guide, it is therefore desirable to photograph several hundred or more products.

次に、ステップＳ８３３では、ステップＳ８３２で選択したＰ枚の撮影画像を用いて、Ｍ個の候補学習モデルから１つを選択学習モデルとして選択するための処理が行われる。ステップＳ８３３における処理の詳細について、図８Ｃのフローチャートに従って説明する。 Next, in step S833, a process for selecting one of the M candidate learning models as the selective learning model is performed using the P captured images selected in step S832. The details of the process in step S833 will be described according to the flowchart of FIG. 8C.

In step S8330, the CPU 191 of the cloud server 12 performs, for each of the M candidate learning models, "object detection processing, that is, processing for detecting objects from each of the P captured images by using that candidate learning model". In this embodiment as well, the result of the object detection processing for a captured image is the position information of the image regions (rectangular regions, detection regions) of the objects detected from that captured image.

ステップＳ８３３１では、ＣＰＵ１９１は、Ｍ個の候補学習モデルのそれぞれの「Ｐ枚の撮影画像のそれぞれに対するオブジェクト検出処理の結果」に対するスコアを求める。そしてＣＰＵ１９１は、該スコアに基づいてＭ個の候補学習モデルの順位付け（ランキング作成）を行って、Ｍ個の候補学習モデルからＮ個の候補学習モデルを選択する。候補学習モデルによるオブジェクト検出処理の結果に対するスコアは、例えば、以下のようにして求める。 In step S8331, the CPU 191 obtains a score for each of the "results of object detection processing for each of the P captured images" of the M candidate learning models. Then, the CPU 191 ranks M candidate learning models (ranking creation) based on the score, and selects N candidate learning models from the M candidate learning models. The score for the result of the object detection process by the candidate learning model is obtained, for example, as follows.

For example, consider a task of detecting abnormalities on a printed circuit board, in which object detection processing is performed on various specific local patterns on a fixed printed pattern. Suppose that a particular learning model yields the detection regions 901 to 906 shown in FIG. 9A from a captured image of a normal product. Because abnormalities occur very rarely in products produced on a production line, a good learning model for this task is one that outputs stable results despite the expected variation in captured images. For example, a change in the environment on the area sensor side may slightly alter the appearance of the product image, so that, as shown in FIG. 9B, the detection region 906 among the detection regions 901 to 906 is no longer detected. A learning model whose detection regions change for inputs that differ only slightly should be penalized in its evaluation score.

Therefore, for example, the CPU 191 of the cloud server 12 determines, for each of the M candidate learning models, a score that is larger the more the arrangement pattern of detection regions produced by that candidate learning model differs among the P captured images. Such a score can be obtained, for example, by calculating equation (4) above. The M candidate learning models are then ranked in ascending order of score, and the top N candidate learning models with the smallest scores are selected. The condition "the score is less than a threshold value" may be added to this selection.

In step S8332, the CPU 131 of the information processing apparatus 13 performs, for the P captured images, scoring for presenting information that is easy for the user to compare subjectively (display image scoring), in the same manner as in the first embodiment (step S2332).

ステップＳ８３３３では、情報処理装置１３のＣＰＵ１３１は、ステップＳ８３３１で選択したＮ個の候補学習モデルのそれぞれについて、クラウドサーバ１２から受信したＰ枚の撮影画像のうちスコアが大きい順に上位Ｆ枚の撮影画像と、クラウドサーバ１２から受信した該撮影画像に対するオブジェクト検出処理の結果と、を表示装置１４に表示させる。その際、Ｆ枚の撮影画像はスコアが大きい順に左から並べて表示する。 In step S8333, the CPU 131 of the information processing apparatus 13 takes the top F captured images in descending order of the scores among the P captured images received from the cloud server 12 for each of the N candidate learning models selected in step S8331. And the result of the object detection process for the captured image received from the cloud server 12 are displayed on the display device 14. At that time, the captured images of F images are displayed side by side from the left in descending order of score.

候補学習モデルごとの撮影画像およびオブジェクト検出処理の結果を表示したＧＵＩの表示例を図１０（Ａ）に示す。図１０（Ａ）では、Ｎ＝３、Ｆ＝４のケースについて示している。 FIG. 10A shows a display example of the GUI displaying the captured image and the result of the object detection process for each candidate learning model. FIG. 10A shows the case of N = 3 and F = 4.

最上の行には、スコアが最も大きい候補学習モデルのモデル名「Ｍ００５」がラジオボタン１００と共に表示されており、その右側には、スコアが大きい順に上位４枚の撮影画像が左から順に並んで表示されている。該撮影画像にはモデル名「Ｍ００５」の候補学習モデルが該撮影画像から検出した検出領域を示す枠が重ねて表示されている。 In the top row, the model name "M005" of the candidate learning model with the highest score is displayed together with the radio button 100, and on the right side, the top four captured images are arranged in order from the left in descending order of score. It is displayed. In the captured image, a frame showing a detection area detected by the candidate learning model of the model name “M005” is displayed superimposed on the captured image.

中段の行には、スコアが２番目に大きい候補学習モデルのモデル名「Ｍ０２３」がラジオボタン１００と共に表示されており、その右側には、スコアが大きい順に上位４枚の撮影画像が左から順に並んで表示されている。該撮影画像にはモデル名「Ｍ０２３」の候補学習モデルが該撮影画像から検出した検出領域を示す枠が重ねて表示されている。 In the middle row, the model name "M023" of the candidate learning model with the second highest score is displayed together with the radio button 100, and on the right side, the top four captured images in descending order of score are displayed in order from the left. They are displayed side by side. In the captured image, a frame showing a detection area detected by the candidate learning model of the model name “M023” from the captured image is superimposed and displayed.

下段の行には、スコアが３番目に大きい候補学習モデルのモデル名「Ｍ０１４」がラジオボタン１００と共に表示されており、その右側には、スコアが大きい順に上位４枚の撮影画像が左から順に並んで表示されている。該撮影画像にはモデル名「Ｍ０１４」の候補学習モデルが該撮影画像から検出した検出領域を示す枠が重ねて表示されている。 In the lower row, the model name "M014" of the candidate learning model with the third highest score is displayed together with the radio button 100, and on the right side, the top four captured images in descending order of score are displayed in order from the left. They are displayed side by side. In the captured image, a frame showing a detection area detected by the candidate learning model of the model name “M014” is displayed superimposed on the captured image.

In this case, because the product appearance is almost fixed and most products are normal, the differences in the arrangement pattern of detection regions appear as shown in FIG. 10A. The F captured images are displayed from the left in descending order of score, and images with large scores tend to be those taken under substantially different imaging conditions or those of individual products containing an abnormal region. In the conventional approach, the user had to annotate abnormal regions in captured product images in advance and manually search a large number of products for defective ones before setting up the inspection device. With this GUI, none of that work is needed: captured images of products that may contain an abnormal region are presented to the user preferentially, which saves labor. The user can simply compare the results of the object detection processing in the GUI of FIG. 10A and select a learning model that correctly detects the abnormal regions.

ステップＳ８３３４では、情報処理装置１３のＣＰＵ１３１は、ユーザによる候補学習モデルの選択操作（ユーザ入力）を受け付ける。ステップＳ８３３５では、情報処理装置１３のＣＰＵ１３１は、ユーザによる候補学習モデルの選択操作（ユーザ入力）が行われたか否かを判断する。 In step S8334, the CPU 131 of the information processing apparatus 13 accepts a candidate learning model selection operation (user input) by the user. In step S8335, the CPU 131 of the information processing apparatus 13 determines whether or not the selection operation (user input) of the candidate learning model by the user has been performed.

In the case of FIG. 10(A), to select the candidate learning model with the model name "M005", the user selects the radio button 100 in the top row using the user interface 15. To select the candidate learning model with the model name "M023", the user selects the radio button 100 in the middle row using the user interface 15. To select the candidate learning model with the model name "M014", the user selects the radio button 100 in the lower row using the user interface 15. In FIG. 10(A), the radio button 100 corresponding to the model name "M005" is selected, so a frame 104 indicating that the candidate learning model with the model name "M005" has been selected is displayed.

When the user then operates the user interface 15 to press the decision button 101, the CPU 131 determines that "the candidate learning model selection operation (user input) has been performed by the user" and selects the candidate learning model corresponding to the selected radio button 100 as the selected learning model.

If this determination shows that the candidate learning model selection operation (user input) has been performed by the user, the processing proceeds to step S8336. If the selection operation has not been performed, the processing returns to step S8334.

In step S8336, the CPU 131 of the information processing apparatus 13 checks whether the number of learning models finally selected equals "the number desired by the user". If it does, the processing proceeds to step S84. If it does not, the processing proceeds to step S8332.

Here, "the number desired by the user" is determined mainly according to the time that may be spent on the visual inspection (the takt time). For example, when "the number desired by the user" is two, a wider range of defects may be detectable by giving the two learning models different detection tendencies, such as detecting low-frequency abnormal regions with one learning model and high-frequency defects with the other.

If the user cannot narrow the candidates down to "the number desired by the user" just by looking at the display of FIG. 10(A), the user may select a plurality of candidate learning models by selecting a plurality of radio buttons 100. For example, when "the number desired by the user" is 1 and two radio buttons 100 have been selected, N is set to 2 and the processing proceeds to step S8332 via step S8336. In this case, the same processing is performed with N = 2 and F = 4 from step S8332 onward. The processing is repeated in this way until the number of learning models finally selected equals "the number desired by the user".
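The repetition of steps S8332 to S8336 described above can be sketched roughly as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function name `narrow_candidates`, the callback `ask_user` (standing in for the GUI display and the decision button 101), and the `max_rounds` safety limit are all assumptions introduced here. The sketch also assumes that a selection containing fewer models than desired is simply ignored, which the description does not specify.

```python
from typing import Callable, List, Sequence


def narrow_candidates(
    candidates: Sequence[str],
    desired_count: int,
    ask_user: Callable[[Sequence[str]], Sequence[str]],
    max_rounds: int = 10,
) -> List[str]:
    """Repeat display and selection until `desired_count` models remain."""
    current = list(candidates)
    for _ in range(max_rounds):
        # S8332 to S8335: show the results for the current candidates
        # and wait for the user's selection (hypothetical callback).
        chosen = [m for m in ask_user(current) if m in current]
        if len(chosen) < desired_count:
            continue  # assumption: too few models chosen, ask again
        current = chosen  # the next round uses N = len(current)
        # S8336: finish once exactly the desired number is selected.
        if len(current) == desired_count:
            return current
    raise RuntimeError("selection did not converge within max_rounds")
```

For instance, with three candidates and a desired count of 1, a user who first picks two models and then one of those two ends the loop after the second round.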

The user may also select the learning model using the GUI of FIG. 10(B) instead of the GUI of FIG. 10(A). The GUI of FIG. 10(A) lets the user directly choose which learning model is best. In the GUI of FIG. 10(B), by contrast, each captured image is provided with a check box 102. For each vertical column of captured images, the user operates the user interface 15 to turn on (check) the check box 102 of the captured image in that column whose object detection result the user judges to be preferable. When the user then operates the user interface 15 to press the decision button 101, the CPU 131 of the information processing apparatus 13 selects, from among the candidate learning models with the model names "M005", "M023", and "M014", the candidate learning model with the largest number of captured images whose check box 102 is on as the selected learning model. In the example of FIG. 10(B), two of the four captured images of the candidate learning model "M005" have their check boxes 102 on, one of the four captured images of the candidate learning model "M023" has its check box 102 on, and one of the four captured images of the candidate learning model "M014" has its check box 102 on. In this case, the candidate learning model with the model name "M005" is selected as the selected learning model. Selecting the learning model through such a GUI is effective, for example, when the value of F is large and it is difficult for the user to judge which candidate learning model is best.

In the GUI of FIG. 10(B), the simplest way to finally narrow the candidates down to "the number desired by the user" is to rank the candidate learning models in descending order of the number of check boxes that are on and select the top-ranked models up to "the number desired by the user".
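As a minimal sketch of this counting rule, assuming the check box states are available as one list of booleans per candidate learning model, the ranking could look like the following. The function name `select_by_checked_count` and the data layout are hypothetical, and the positions of the checked images in the example are made up. Only the totals (2, 1, and 1) follow the example of FIG. 10(B).

```python
from typing import Dict, List


def select_by_checked_count(
    checked: Dict[str, List[bool]], desired_count: int
) -> List[str]:
    """Rank models by the number of checked images and keep the top ones."""
    counts = {model: sum(flags) for model, flags in checked.items()}
    ranked = sorted(counts, key=lambda m: counts[m], reverse=True)
    return ranked[:desired_count]


# Check box states per model (positions are illustrative only).
checked = {
    "M005": [True, True, False, False],
    "M023": [True, False, False, False],
    "M014": [False, False, True, False],
}
print(select_by_checked_count(checked, 1))  # ['M005']
```

Ties are not resolved here. The sort simply keeps the input order for models with equal counts.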

If there are candidate learning models whose "number of captured images with the check box 102 on" is the same or differs only slightly, it is determined in step S8336 that the number of learning models finally selected does not equal "the number desired by the user", and the processing proceeds to step S8332. From step S8332 onward, the processing is performed on the candidate learning models whose "number of captured images with the check box 102 on" is the same or differs only slightly. In this case as well, the processing is repeated until the number of learning models finally selected equals "the number desired by the user".
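One way to detect such a tie or near-tie is sketched below. The description does not define how small a difference counts as "slight", so the `margin` parameter, the function name `models_needing_another_round`, and the choice to compare only at the cutoff position are all assumptions made for illustration.

```python
from typing import Dict, List


def models_needing_another_round(
    counts: Dict[str, int], desired_count: int, margin: int = 0
) -> List[str]:
    """Return the models tied (within `margin`) at the cutoff, else []."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) <= desired_count:
        return []  # nothing would be dropped, so no tie matters
    cutoff = ranked[desired_count - 1][1]  # last model that would be kept
    next_best = ranked[desired_count][1]   # first model that would be dropped
    if cutoff - next_best > margin:
        return []  # clear separation: no further round is needed
    # These models go back to step S8332 for another comparison round.
    return [model for model, count in ranked if abs(count - cutoff) <= margin]
```

An empty result means the top-ranked models can be adopted as they are. A non-empty result lists the candidates to compare again.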

In step S84, the CPU 191 of the cloud server 12 performs object detection processing on the captured images (the captured images that the camera 10 transmitted to the cloud server 12 and the information processing apparatus 13) using the selected learning model specified by the information notified from the information processing apparatus 13. The CPU 191 of the cloud server 12 then makes the final settings of the inspection apparatus based on the detection areas obtained as a result of the object detection processing. Once the production line actually starts up, inspection is executed with the learning model and the various parameters set here.

<Modification>
Each of the above embodiments is an example of a technique for reducing the cost of training a learning model and adjusting its settings every time detection and identification processing is to be performed on a new target in a task that performs the intended detection and identification processing. Accordingly, the techniques described in the above embodiments are not limited to predicting crop yield, detecting areas to be repaired, detecting abnormal regions in industrial products under inspection, and the like. They are applicable to agriculture, industry, fisheries, and a wide range of other fields.

The radio buttons and check boxes described above are displayed as examples of selection units with which the user selects a target. Other display items may be displayed instead as long as they provide the same effect. The above embodiments describe a configuration in which the learning model used for the object detection processing is selected based on a user operation (S24). However, the configuration is not limited to this, and the learning model used for the object detection processing may be selected automatically. For example, the candidate learning model with the highest score may be automatically selected as the learning model used for the object detection processing.
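The automatic selection mentioned above amounts to taking the candidate with the maximum score. A minimal sketch follows, assuming the scores are already available as a mapping from model name to score. The function name `auto_select` and the score values are hypothetical and do not come from the embodiments.

```python
from typing import Dict


def auto_select(scores: Dict[str, float]) -> str:
    """Return the name of the candidate learning model with the highest score."""
    return max(scores, key=lambda model: scores[model])


# Hypothetical scores, for illustration only.
scores = {"M005": 0.91, "M023": 0.84, "M014": 0.79}
print(auto_select(scores))  # M005
```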

The entity that performs each process in the above description is also only an example. For example, part or all of the processing described as being performed by the CPU 191 of the cloud server 12 may be performed by the CPU 131 of the information processing apparatus 13. Likewise, part or all of the processing described as being performed by the CPU 131 of the information processing apparatus 13 may be performed by the CPU 191 of the cloud server 12.

In the above description, the system of each embodiment performs the analysis processing. However, the entity that performs the analysis processing is not limited to the system of each embodiment. For example, the analysis processing may be performed by another apparatus or system.

The numerical values, processing timings, processing orders, processing entities, and the configurations, destinations, and sources of data (information) used in the above embodiments and modifications are given as examples for the sake of concrete explanation, and are not intended to limit the invention to those examples.

Part or all of the embodiments and modifications described above may be used in combination as appropriate. Part or all of the embodiments and modifications described above may also be used selectively.

(Other embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

The invention is not limited to the above embodiments, and various changes and modifications can be made without departing from the spirit and scope of the invention. Accordingly, the claims are attached to make the scope of the invention public.

10: Camera; 11: Communication network; 12: Cloud server; 13: Information processing apparatus; 14: Display device; 15: User interface; 131: CPU; 132: RAM; 133: ROM; 134: Output I/F; 135: Input I/F; 191: CPU; 192: RAM; 193: ROM; 194: Operation unit; 195: Display unit; 196: External storage device; 197: I/F; 198: System bus

Claims

An information processing apparatus comprising:
first selection means for selecting, based on information relating to image capturing of an object, one or more learning models as candidate learning models from a plurality of learning models trained in mutually different learning environments;
second selection means for selecting one or more candidate learning models from the candidate learning models selected by the first selection means, based on results of object detection processing by those candidate learning models; and
detection means for performing object detection processing on a captured image of the object using at least one of the candidate learning models selected by the second selection means.

The information processing apparatus according to claim 1, wherein the first selection means generates a query parameter based on the information and selects, from the plurality of learning models, one or more learning models trained in an environment similar to the environment indicated by the query parameter as the candidate learning models.

The information processing apparatus according to claim 1 or 2, wherein the second selection means obtains, for each candidate learning model selected by the first selection means, a score based on the result of object detection processing by that candidate learning model, and selects, based on the scores, one or more candidate learning models from the candidate learning models selected by the first selection means.

The information processing apparatus according to claim 1, further comprising display control means for displaying the results of object detection processing by the candidate learning models selected by the second selection means.

The information processing apparatus according to claim 4, wherein the display control means determines, for each of a plurality of captured images on which the candidate learning models selected by the second selection means have performed the object detection processing, a score that becomes larger as the results of the object detection processing differ more greatly among the candidate learning models, and displays, for each candidate learning model selected by the second selection means, a GUI including the results of object detection processing by that candidate learning model for a specified number of captured images taken from the top in descending order of score.

The information processing apparatus according to claim 5, wherein the GUI includes a selection unit for selecting a candidate learning model, and
the detection means uses, as a selected learning model, the candidate learning model corresponding to the selection unit selected in the GUI according to a user operation, and performs the object detection processing using the selected learning model.

The information processing apparatus according to claim 5, wherein the detection means uses, as a selected learning model, the candidate learning model for which the largest number of object detection results have been selected according to a user operation from among the object detection results displayed for each candidate learning model by the display control means, and performs the object detection processing using the selected learning model.

In addition,
6. The information processing device described.

The information processing apparatus according to any one of claims 1 to 8, wherein the information includes Exif information of the captured image, information relating to the field in which the captured image was captured, and information relating to an object included in the captured image.

The information processing apparatus according to any one of claims 1 to 7, further comprising means for configuring an apparatus that photographs and inspects the appearance of a product, based on the detection area of the object obtained as a result of the object detection processing.

The information processing apparatus according to any one of claims 1 to 7 and 10, wherein the information includes information relating to an object included in the captured image.

The information processing apparatus according to any one of claims 1 to 11, wherein the detection means performs the object detection processing on the captured image of the object using a candidate learning model that is selected, based on a user operation, from among the candidate learning models selected by the second selection means.

An information processing method performed by an information processing apparatus, the method comprising:
a first selection step in which first selection means of the information processing apparatus selects, based on information relating to image capturing of an object, one or more learning models as candidate learning models from a plurality of learning models trained in mutually different learning environments;
a second selection step in which second selection means of the information processing apparatus selects one or more candidate learning models from the candidate learning models selected in the first selection step, based on results of object detection processing by those candidate learning models; and
a detection step in which detection means of the information processing apparatus performs object detection processing on a captured image of the object using at least one of the candidate learning models selected in the second selection step.

A computer program for making a computer function as each means of the information processing apparatus according to any one of claims 1 to 12.