JP7222519B2

JP7222519B2 - Object identification system, model learning system, object identification method, model learning method, program

Info

Publication number: JP7222519B2
Application number: JP2018169183A
Authority: JP
Inventors: 和幸橋本; 三好堀川; 東岡本
Original assignee: Iwate Prefectural University
Current assignee: Iwate Prefectural University
Priority date: 2018-09-10
Filing date: 2018-09-10
Publication date: 2023-02-15
Anticipated expiration: 2038-09-10
Also published as: JP2020042528A

Description

Application of Article 30, Paragraph 2 of the Patent Act
1. Presentation at a meeting of the object identification system, model learning system, object identification method, model learning method, and program. Meeting: FY2017 Graduation Research Presentation, Faculty of Software and Information Science, Iwate Prefectural University. Venue: Faculty of Software and Information Science Building, Iwate Prefectural University. Date: February 9, 2018.
2. Presentation in a publication of the object identification system, model learning system, object identification method, model learning method, and program. Publication: Proceedings of the 80th National Convention of the Information Processing Society of Japan. Publisher: Information Processing Society of Japan. Publication date: March 13, 2018.
3. Presentation at a meeting of the object identification system, model learning system, object identification method, model learning method, and program. Meeting: 80th National Convention of the Information Processing Society of Japan. Venue: Nishi-Waseda Campus, Waseda University. Date: March 15, 2018.

The present invention relates to an object identification system, a model learning system, an object identification method, a model learning method, and a program.

Conventionally, image recognition techniques targeting objects (for example, people, animals, and things) included in a captured image are known (for example, Patent Document 1). The technique described in Patent Document 1 includes a feature extraction unit that calculates a feature vector including, as an element, a gradient moment, which is the product of the gradient of a pixel's value in an image and that pixel's coordinates, and a classification unit that classifies images based on the similarity of the feature vectors calculated by the feature extraction unit. It is configured to classify images into "vehicle images" and "non-vehicle images" based on image features.

Patent Document 1: JP 2015-011552 A

With the prior art, only the class ("vehicle" or "non-vehicle") to which an object in an image belongs can be recognized (class recognition). For example, when there are captured images of each of a plurality of objects belonging to the same class (for example, "vehicle"), it has been difficult to recognize the specific object itself included in each captured image (instance recognition).

The present invention has been made in view of the above problem, and its object is to provide an object identification system, a model learning system, an object identification method, a model learning method, and a program capable of identifying the object itself included in a captured image.

To solve the above problem, the present invention first provides an object identification system comprising: first acquisition means for acquiring a captured image of any one of a plurality of objects existing in a predetermined space; second acquisition means for acquiring environmental information at the position where that object was imaged; and identification means for identifying, when the captured image and the environmental information have been acquired, which of the plurality of objects the imaged object is, based on the acquired captured image and environmental information and on a trained model obtained by machine learning that uses captured images and environmental information as learning data (Invention 1).

According to this invention (Invention 1), once a captured image and environmental information are acquired, the system identifies which of the plurality of objects the imaged object is, based on the acquired captured image and environmental information and on a trained model obtained by machine learning that uses captured images and environmental information as learning data. Thus, rather than identifying only the class to which an object belongs from its captured image alone, the system additionally uses the environmental information at the object's imaging position together with the captured image, which makes it possible to identify the object itself included in the captured image. This improves individual-recognition performance for imaged objects.
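The idea of Invention 1 can be illustrated with a small Python sketch. It is not the model described in this publication: a nearest-centroid classifier stands in for the trained model, and the function names, the two-element "image features", and the RSSI rescaling are all assumptions made for illustration. It shows two objects whose image features are identical being told apart by the environmental information.

```python
import math

def combine(image_features, rssi_dbm):
    """Concatenate image features with RSSI values rescaled to roughly [0, 1].

    The rescaling (-100 dBm -> 0.0, 0 dBm -> 1.0) is an assumption.
    """
    scaled = [(r + 100.0) / 100.0 for r in rssi_dbm]
    return list(image_features) + scaled

def fit_centroids(samples):
    """samples: list of (object_id, feature_vector). Returns per-object mean vectors."""
    sums, counts = {}, {}
    for obj, vec in samples:
        acc = sums.setdefault(obj, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[obj] = counts.get(obj, 0) + 1
    return {obj: [s / counts[obj] for s in acc] for obj, acc in sums.items()}

def identify(centroids, vec):
    """Return the object whose centroid is closest to the combined feature vector."""
    return min(centroids, key=lambda obj: math.dist(centroids[obj], vec))

# Two same-class objects (hypothetical trash cans "a" and "b") that look alike
# but were imaged at positions with different RSSI readings.
training = [
    ("a", combine([0.9, 0.1], [-40.0, -70.0])),
    ("b", combine([0.9, 0.1], [-70.0, -40.0])),
]
centroids = fit_centroids(training)
print(identify(centroids, combine([0.9, 0.1], [-42.0, -68.0])))  # prints "a"
```

With image features alone the two objects would be indistinguishable here; only the appended RSSI components separate them.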

In the above invention (Invention 1), the second acquisition means may acquire, as the environmental information, the received signal strength observed when a radio signal transmitted by one of (i) communication devices provided at a plurality of positions in the predetermined space and (ii) a terminal device present at the position where the object was imaged is received by the other (Invention 2).

According to this invention (Invention 2), the received signal strength observed when a radio signal transmitted by either the plurality of communication devices or the terminal device is received by the other can be acquired as environmental information representing the position of the terminal device when the object was imaged (that is, the imaging position of that object).
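Why received signal strength can stand in for position can be sketched with a log-distance path-loss model. This model is a common textbook approximation and is an assumption here; this publication does not state which relationship it uses, and the reference level and exponent below are hypothetical values.

```python
import math

def expected_rssi(distance_m, rssi_at_1m=-40.0, path_loss_exponent=2.0):
    """Expected RSSI in dBm at a given distance under a log-distance model.

    rssi_at_1m and path_loss_exponent are illustrative assumptions; real
    values depend on the radio hardware and the indoor environment.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return rssi_at_1m - 10.0 * path_loss_exponent * math.log10(distance_m)

# RSSI falls as the terminal moves away from a communication device, so the
# set of RSSI values seen from several devices varies with position.
print(expected_rssi(1.0))   # prints -40.0
print(expected_rssi(10.0))  # prints -60.0
```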

In the above invention (Invention 2), the environmental information used as the learning data may include a plurality of combinations of the received signal strengths of the communication devices, each drawn according to the Gaussian distribution of the received signal strength of each communication device corresponding to a given object among the plurality of objects, on the assumption that the received signal strength corresponding to each communication device follows a Gaussian distribution that differs for each imaged object and for each communication device (Invention 3).

According to this invention (Invention 3), by assuming, for example, that a given object among the plurality of objects has been imaged and drawing a received signal strength for each communication device according to the Gaussian distribution of received signal strength for that communication device corresponding to the given object, a plurality of combinations of received signal strengths corresponding to that object can be generated easily. This makes it possible to increase the amount of environmental information included in the learning data (data augmentation), so machine learning can proceed efficiently.
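The augmentation described for Invention 3 can be sketched as follows. The function name, the dictionary layout, and the example means and standard deviations are assumptions for illustration; the text specifies only that one received signal strength per communication device is drawn from a Gaussian distribution specific to the object and the device.

```python
import random

def augment_rssi(gaussians, n_samples, seed=None):
    """Generate n_samples RSSI combinations for one object.

    gaussians: {device_id: (mean_dbm, std_dbm)} for that object.
    Each returned dict holds one drawn RSSI value per communication device.
    """
    rng = random.Random(seed)
    return [
        {dev: rng.gauss(mu, sigma) for dev, (mu, sigma) in gaussians.items()}
        for _ in range(n_samples)
    ]

# Hypothetical per-device distributions for object "a" and devices "A" to "D".
object_a = {"A": (-45.0, 3.0), "B": (-60.0, 4.0), "C": (-72.0, 5.0), "D": (-80.0, 4.0)}
samples = augment_rssi(object_a, n_samples=5, seed=0)
print(len(samples), sorted(samples[0]))  # prints 5 ['A', 'B', 'C', 'D']
```

Each generated combination would be paired with a captured image of the same object to form one learning record.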

Second, the present invention provides a model learning system comprising: acquisition means for acquiring a captured image of any one of a plurality of objects existing in a predetermined space and environmental information at the position where that object was imaged; and learning means for learning, by machine learning that uses the acquired captured image and environmental information as learning data, a model used to identify which of the plurality of objects an imaged object is (Invention 4).

According to this invention (Invention 4), machine learning that uses captured images and environmental information as learning data makes it possible to learn a model used to identify which of the plurality of objects an imaged object is, so the object itself included in a captured image can be identified by using this model.

Third, the present invention provides an object identification method that causes a computer to execute the steps of: acquiring a captured image of any one of a plurality of objects existing in a predetermined space; acquiring environmental information at the position where that object was imaged; and identifying, when the captured image and the environmental information have been acquired, which of the plurality of objects the imaged object is, based on the acquired captured image and environmental information and on a trained model obtained by machine learning that uses captured images and environmental information as learning data (Invention 5).

Fourth, the present invention provides a model learning method that causes a computer to execute the steps of: acquiring a captured image of any one of a plurality of objects existing in a predetermined space and environmental information at the position where that object was imaged; and learning, using the acquired captured image and environmental information as learning data, a model used to identify which of the plurality of objects an imaged object is (Invention 6).

Fifth, the present invention provides a program for causing a computer to realize: a function of acquiring a captured image of any one of a plurality of objects existing in a predetermined space; a function of acquiring environmental information at the position where that object was imaged; and a function of identifying, when the captured image and the environmental information have been acquired, which of the plurality of objects the imaged object is, based on the acquired captured image and environmental information and on a trained model obtained by machine learning that uses captured images and environmental information as learning data (Invention 7).

Sixth, the present invention provides a program for causing a computer to realize: a function of acquiring a captured image of any one of a plurality of objects existing in a predetermined space and environmental information at the position where that object was imaged; and a function of learning, using the acquired captured image and environmental information as learning data, a model used to identify which of the plurality of objects an imaged object is (Invention 8).

According to the object identification system, model learning system, object identification method, model learning method, and program of the present invention, the object itself included in a captured image can be identified.

FIG. 1 is a diagram schematically showing the basic configuration of an object identification system and a model learning system according to one embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the terminal device.
FIG. 3 is a block diagram showing the configuration of the identification device.
FIG. 4 is a functional block diagram for explaining the functions that play major roles in the object identification system and the model learning system.
FIG. 5 is a diagram showing a configuration example of acquired data.
FIG. 6 is a diagram showing the relationship between the distance between a communication device and the terminal device and the received signal strength.
FIG. 7 is a diagram showing a configuration example of learning data.
FIG. 8 (a) to (c) are diagrams showing an example of the structure of a model subject to machine learning.
FIG. 9 (a) to (c) are diagrams showing an example of the distribution of received signal strength for each imaging target and for each communication device.
FIG. 10 is a flowchart showing an example of the main processing of the model learning system according to one embodiment of the present invention.
FIG. 11 is a flowchart showing an example of the main processing of the object identification system according to one embodiment of the present invention.
FIG. 12 is a diagram showing an example of how the functions of the object identification system and the model learning system are divided between an identification device and a learning device.

An embodiment of the present invention will be described in detail below with reference to the accompanying drawings. This embodiment is an example, however, and the present invention is not limited to it.

(1) Basic configuration of the object identification system and the model learning system
FIG. 1 is a diagram schematically showing the basic configuration of an object identification system and a model learning system according to one embodiment of the present invention. As shown in FIG. 1, in the object identification system according to this embodiment, a terminal device 20 that wirelessly communicates with each of a plurality of communication devices 10 (four, labeled "A" to "D" in the example of FIG. 1) provided in a predetermined space S, such as an indoor space, is present in the space S. When the identification device 30 acquires (i) a captured image, taken with the terminal device 20, of any one of a plurality of objects OB existing in the space S (four, labeled "a" to "d" in the example of FIG. 1; for example, physical items such as trash cans or fire extinguishers) and (ii) environmental information (in this embodiment, received signal strength (RSSI)) at the position of the terminal device 20 when that object OB was imaged, the identification device 30 identifies which of the plurality of objects OB the imaged object OB is, based on the acquired captured image and environmental information and on a trained model obtained by machine learning that uses captured images and environmental information as learning data. Here, the terminal device 20 and the identification device 30 are connected to a communication network NW such as the Internet or a LAN (Local Area Network).

In the model learning system according to this embodiment, when the identification device 30 acquires a captured image and environmental information, it learns, by machine learning that uses the acquired captured image and environmental information as learning data, a model used to identify which of the plurality of objects OB an imaged object OB is.

This embodiment describes, as an example, a case where the plurality of objects OB are objects of the same type (class), but the objects OB may be of different types.

Each communication device 10 is provided at a position in the space S where it can wirelessly communicate with the terminal device 20 using a wireless LAN (for example, Wi-Fi (registered trademark)). Each communication device 10 may be, for example, a device that relays wireless communication between two or more terminal devices 20, a device that relays wireless communication between the terminal device 20 and another terminal device (not shown) existing in the space S, or a device that relays communication between the terminal device 20 and another device connected via the communication network NW. Each communication device 10 may also be a packet capture device.

Although wireless communication using Wi-Fi (registered trademark) is described here as an example, the communication method is not limited to this. For example, a wireless communication method such as Bluetooth (registered trademark), ZigBee (registered trademark), UWB, or optical wireless communication (for example, infrared) may be used, or a wired communication method such as USB may be used.

The terminal device 20 is configured to be able to wirelessly communicate with each communication device 10 using a wireless LAN when it is present in the space S. To communicate wirelessly with each communication device 10, the terminal device 20 may be configured to transmit a radio signal (for example, a probe request) containing its own identification information (for example, a MAC address) at predetermined intervals (for example, every few seconds). Furthermore, the terminal device 20 may be configured to measure environmental information (in this embodiment, the received signal strength of a radio signal) at the imaging position when it images any of the objects OB. The terminal device 20 may be a terminal device operated by an individual user, such as a mobile terminal, a smartphone, a PDA (Personal Digital Assistant), a personal computer, or a television receiver with two-way communication capability (including so-called multifunctional smart TVs).

The identification device 30 is configured to communicate with the terminal device 20 via the communication network NW and to acquire captured images and environmental information from the terminal device 20. When the identification device 30 is configured to be able to communicate with the terminal device 20 via the plurality of communication devices 10, it need not be connected to the terminal device 20 via the communication network NW. The identification device 30 may be, for example, a general-purpose personal computer.

(2) Configuration of the terminal device
The configuration of the terminal device 20 will be described with reference to FIG. 2. FIG. 2 is a block diagram showing the internal configuration of the terminal device 20. As shown in FIG. 2, the terminal device 20 includes a CPU (Central Processing Unit) 21, a ROM (Read Only Memory) 22, a RAM (Random Access Memory) 23, a storage device 24, a display processing unit 25, a display unit 26, an input unit 27, an imaging unit 28, and a communication interface unit 29, and is provided with a bus 20a for transmitting control signals or data signals between these units.

When the terminal device 20 is powered on, the CPU 21 loads various programs stored in the ROM 22 or the storage device 24 into the RAM 23 and executes them. In addition, each time an image including any one of the plurality of objects OB is captured using the imaging unit 28, the CPU 21 transmits the captured image and the environmental information at the imaging position to the identification device 30 via the communication interface unit 29.
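What the terminal device sends after each capture can be sketched as a single message carrying both the captured image and the environmental information. The JSON layout, the field names "image" and "rssi", and the base64 encoding are assumptions for illustration; this publication does not specify a message format.

```python
import base64
import json

def build_payload(image_bytes, rssi_by_device):
    """Pack a captured image and per-device RSSI readings into one JSON string."""
    return json.dumps({
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "rssi": rssi_by_device,  # {device identification info: RSSI in dBm}
    })

def parse_payload(payload):
    """Inverse of build_payload, as the receiving side might apply it."""
    data = json.loads(payload)
    return base64.b64decode(data["image"]), data["rssi"]

# Hypothetical image bytes and device identifiers.
payload = build_payload(b"\x89PNG...", {"AA:BB:CC:00:00:01": -48.5})
image, rssi = parse_payload(payload)
```

Sending both items in one message keeps each image tied to the RSSI values measured at the same imaging position.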

The storage device 24 may be a non-volatile storage device such as a flash memory, an SSD (Solid State Drive), a magnetic storage device (for example, an HDD (Hard Disk Drive), a floppy disk (registered trademark), or magnetic tape), or an optical disk, or it may be a volatile storage device such as a RAM. It stores programs executed by the CPU 21 and data referenced by the CPU 21.

The display processing unit 25 displays, on the display unit 26, display data supplied from the CPU 21. The display unit 26 is, for example, an LCD (Liquid Crystal Display) monitor including thin-film transistors arranged in a matrix on a pixel-by-pixel basis, and it displays the data on its display screen by driving the thin-film transistors based on the display data.

When the terminal device 20 is a button-input communication terminal, the input unit 27 includes a button group containing a plurality of instruction input buttons, such as direction buttons and an enter button, for accepting the user's operation input, and a button group containing a plurality of instruction input buttons such as a numeric keypad. It also includes an interface circuit for recognizing the pressing (operation) of each button and outputting it to the CPU 21.

When the terminal device 20 is a touch-panel-input communication terminal, the input unit 27 mainly accepts touch-panel input made by touching the display screen with a fingertip or a pen. The touch-panel input method may be a known method such as the capacitive method.

When the terminal device 20 is a terminal device capable of voice input, the input unit 27 may be configured to include a microphone for voice input, or may include an interface circuit for outputting voice data input via an external microphone to the CPU 21.

The imaging unit 28 may be an imaging device that captures moving images and/or still images (for example, a digital camera or a digital video camera). It is configured to perform imaging processing when a predetermined imaging instruction is input using the input unit 27 and to store the captured image in, for example, the RAM 23 or the storage device 24. The imaging unit 28 may be built into the terminal device 20 or provided outside it.

The communication interface unit 29 includes an interface circuit for wireless communication with each communication device 10 and an interface circuit for communication via the communication network NW. The communication interface unit 29 is also provided with an RSSI circuit that detects the received signal strength when a radio signal transmitted from each communication device 10 is received.

The radio signal transmitted from each communication device 10 may contain identification information of the transmitting communication device 10 (for example, a MAC (Media Access Control) address). When the RSSI circuit detects the received signal strength of a radio signal transmitted from a communication device 10, the detected received signal strength and the identification information of that communication device 10 may be stored in association with each other in, for example, the RAM 23 or the storage device 24.
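The association between a detected received signal strength and the identification information of the transmitting communication device can be sketched as a lookup keyed by that identification information. The function name, the fixed device ordering, and the -100 dBm placeholder for devices with no reading are assumptions for illustration and do not appear in this publication.

```python
def rssi_vector(readings, device_order, missing_dbm=-100.0):
    """Turn (device_id, rssi_dbm) readings into a fixed-order RSSI list.

    Later readings for the same device overwrite earlier ones. Devices with
    no reading get missing_dbm, an assumed placeholder value.
    """
    latest = {}
    for device_id, rssi in readings:
        latest[device_id] = rssi
    return [latest.get(device_id, missing_dbm) for device_id in device_order]

# Hypothetical identifiers standing in for MAC addresses of devices "A" to "C".
readings = [("A", -50.0), ("B", -60.0), ("A", -52.0)]
print(rssi_vector(readings, ["A", "B", "C"]))  # prints [-52.0, -60.0, -100.0]
```

A fixed ordering matters because a model that consumes the RSSI values needs each position in the vector to refer to the same communication device every time.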

(3) Configuration of the identification device
The configuration of the identification device 30 will be described with reference to FIG. 3. FIG. 3 is a block diagram showing the internal configuration of the identification device 30. As shown in FIG. 3, the identification device 30 includes a CPU 31, a ROM 32, a RAM 33, a storage device 34, a display processing unit 35, a display unit 36, an input unit 37, and a communication interface unit 38, and is provided with a bus 30a for transmitting control signals or data signals between these units.

When the identification device 30 is powered on, the CPU 31 loads various programs stored in the ROM 32 or the storage device 34 into the RAM 33 and executes them. In this embodiment, the CPU 31 reads out and executes programs stored in the ROM 32 or the storage device 34, thereby realizing the functions of the first acquisition means 41, the second acquisition means 42, the identification means 43, the acquisition means 44, and the learning means 45 (shown in FIG. 4), which are described later.

The storage device 34 may be a non-volatile storage device such as a flash memory, an SSD, a magnetic storage device (for example, an HDD, a floppy disk (registered trademark), or magnetic tape), or an optical disk, or it may be a volatile storage device such as a RAM. It stores programs executed by the CPU 31 and data referenced by the CPU 31. The storage device 34 also stores the acquired data (shown in FIG. 5) and the learning data (shown in FIG. 7), which are described later.

The input unit 37 may be an information input device such as a mouse or a keyboard, may be configured to include a microphone for voice input, may be configured to include a digital camera or digital video camera for image input, or may include an interface circuit for receiving image data captured by an external digital camera or digital video camera and outputting it to the CPU 31.

The communication interface unit 38 includes an interface circuit for communication via the communication network NW. The details of the other units in the identification device 30 may be the same as those of the terminal device 20.

(4) Overview of the functions of the object identification system and the model learning system
The functions realized by the object identification system and the model learning system of this embodiment will be described with reference to FIG. 4. FIG. 4 is a functional block diagram for explaining the functions that play major roles in the object identification system and the model learning system of this embodiment. In the functional block diagram of FIG. 4, the first acquisition means 41, the second acquisition means 42, and the identification means 43 correspond to the main components of the object identification system of the present invention, and the acquisition means 44 and the learning means 45 correspond to the main components of the model learning system of the present invention.

The first acquisition means 41 has a function of acquiring a captured image of any one of a plurality of objects OB present in the predetermined space S.

第１取得手段４１の機能は、例えば以下のように実現される。先ず、端末装置２０のＣＰＵ２１は、複数のオブジェクトＯＢのうち何れかのオブジェクトＯＢ（例えば、「ａ」のオブジェクトＯＢ）が撮像部２８の撮像範囲内に存在する場合に、所定の撮像指示が入力部２７を用いて入力されると、撮像部２８に対して撮像処理を行わせる。撮像部２８によって撮像された画像（ここでは、「ａ」のオブジェクトＯＢの撮像画像）のデータは、例えばＲＡＭ２３又は記憶装置２４に記憶される。そして、ＣＰＵ２１は、例えばＲＡＭ２３又は記憶装置２４に記憶された撮像画像のデータを、通信インタフェース部２９及び通信網ＮＷを介して識別装置３０に送信する。 The functions of the first acquisition means 41 are realized, for example, as follows. First, the CPU 21 of the terminal device 20 receives a predetermined imaging instruction when any object OB among a plurality of objects OB (for example, the object OB of "a") exists within the imaging range of the imaging unit 28. When input using the unit 27, the imaging unit 28 is caused to perform imaging processing. The data of the image captured by the imaging unit 28 (here, the captured image of the object OB of “a”) is stored in the RAM 23 or the storage device 24, for example. Then, the CPU 21 transmits the captured image data stored in, for example, the RAM 23 or the storage device 24 to the identification device 30 via the communication interface unit 29 and the communication network NW.

Meanwhile, when the CPU 31 of the identification device 30 receives (acquires) the captured image data via the communication interface unit 38, it stores the received captured image data in, for example, the RAM 33 or the storage device 34. When the captured image is a moving image, the CPU 31 may divide it into several frames and store each frame as a captured image in, for example, the RAM 33 or the storage device 34. In this way, a captured image of any one of the plurality of objects OB present in the space S (here, the object OB "a") can be acquired.

第２取得手段４２は、複数のオブジェクトＯＢのうち何れかのオブジェクトＯＢを撮像した位置における環境情報を取得する機能を備える。 The second acquisition means 42 has a function of acquiring environment information at a position where any one of the plurality of objects OB is imaged.

The second acquisition means 42 may also acquire, as the environment information, the received signal strength observed when a radio signal transmitted from one of (i) the communication devices 10 provided at a plurality of positions within the predetermined space S and (ii) the terminal device 20 located at the position where the object OB was imaged is received by the other. In this way, the received signal strength observed when a radio signal transmitted from either the plurality of communication devices 10 or the terminal device 20 is received by the other can be acquired as environment information representing the position of the terminal device 20 at the time the object OB was imaged (that is, the imaging position of that object OB).

ここで、第２取得手段４２によって取得される受信信号強度は、例えば、受信信号強度の値そのものであってもよいし、受信信号強度の値を所定の計算式に代入することによって得られた値であってもよいし、受信信号強度の度合いを表す情報であってもよい。 Here, the received signal strength obtained by the second obtaining means 42 may be, for example, the value of the received signal strength itself, or may be obtained by substituting the value of the received signal strength into a predetermined formula. It may be a value, or it may be information representing the degree of received signal strength.

The function of the second acquisition means 42 is realized, for example, as follows. The case described here as an example is one in which the second acquisition means 42 acquires, as the environment information at the position where any one of the plurality of objects OB was imaged, the received signal strength observed when the terminal device 20 receives the radio signal transmitted from each communication device 10. First, each time the CPU 21 of the terminal device 20 receives a radio signal transmitted from a communication device 10, it stores the received signal strength value of that radio signal, as detected by the RSSI circuit, in, for example, the RAM 23 or the storage device 24. When the CPU 21 causes the imaging unit 28 to perform imaging processing as described above, it extracts, for each of the plurality of communication devices 10, the latest received signal strength value from among those stored in, for example, the RAM 23 or the storage device 24, and transmits the extracted values to the identification device 30 via the communication interface unit 29 and the communication network NW. Here, the CPU 21 may transmit the extracted received signal strength values together with the captured image.

Meanwhile, when the CPU 31 of the identification device 30 receives (acquires) the received signal strength values corresponding to the respective communication devices 10 via the communication interface unit 38, it stores the captured image data acquired by the function of the first acquisition means 41 and the received signal strength values corresponding to the respective communication devices 10, in association with each other, in the acquired data shown in FIG. 5, for example. The acquired data describes, for each captured image, the associated received signal strength values corresponding to the respective communication devices 10 (communication devices A to D in the example of the figure). In this way, the received signal strength observed when the terminal device 20 receives the radio signal transmitted from each communication device 10 can be acquired as the environment information at the position where any one of the plurality of objects OB was imaged.
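The terminal-side extraction of the latest received signal strength per communication device, and the pairing of those values with a captured image, can be sketched as follows. This is only an illustrative sketch: the names `RssiReading`, `latest_rssi_per_device`, and `make_acquired_record`, the timestamp field, and the dictionary layout are assumptions and do not appear in the embodiment, which specifies only that the latest value for each communication device is extracted and stored in association with the captured image.

```python
from dataclasses import dataclass


@dataclass
class RssiReading:
    device_id: str    # communication device label, e.g. "A" to "D" (hypothetical)
    timestamp: float  # time the radio signal was received (hypothetical field)
    rssi_dbm: float   # received signal strength reported by the RSSI circuit


def latest_rssi_per_device(readings, device_ids):
    """Return {device_id: most recent RSSI value, or None if never received}."""
    latest = {d: None for d in device_ids}
    latest_time = {d: float("-inf") for d in device_ids}
    for r in readings:
        if r.device_id in latest and r.timestamp >= latest_time[r.device_id]:
            latest[r.device_id] = r.rssi_dbm
            latest_time[r.device_id] = r.timestamp
    return latest


def make_acquired_record(image_name, readings, device_ids=("A", "B", "C", "D")):
    """Associate one captured image with the latest RSSI of each device."""
    return {"image": image_name,
            "rssi": latest_rssi_per_device(readings, device_ids)}
```

A device from which no signal has been received yet is recorded as `None` here; how such a gap would be handled is not stated in the description.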

The distance between the communication device 10 and the terminal device 20 can be obtained from the received signal strength observed when the terminal device 20 receives the radio signal transmitted from the communication device 10. Specifically, the distance between one communication device 10 and the terminal device 20 can be calculated using, for example, the following equations (1) and (2).
P_r = P_t + G_r + G_t − L … (1)

In equation (1), P_r denotes the received signal strength (dBm), P_t denotes the transmission power (dBm) of the radio wave transmitting device (here, the communication device 10), G_r denotes the gain (dBi) of the receiving antenna of the terminal device 20, G_t denotes the gain (dBi) of the transmitting antenna of the communication device 10, and L denotes the free-space loss (dBm). L is obtained from equation (2), in which d denotes the distance (m) between the communication device 10 and the terminal device 20, f denotes the frequency (Hz) of the radio wave, and c denotes the speed of light (= 2.99792458 × 10^8 m/s). The relationship between distance and received signal strength given by equations (1) and (2) is represented, for example, by the logarithmic function shown in FIG. 6. As FIG. 6 shows, the shorter the distance between the communication device 10 and the terminal device 20, the larger the received signal strength value.
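The distance calculation can be illustrated with a short sketch. Equation (2) is not reproduced in this text, so the sketch assumes that it is the standard free-space path loss formula, L = 20·log10(4πdf/c), which uses exactly the symbols d, f, and c defined above. The function names are hypothetical, and real indoor propagation generally deviates from the free-space model.

```python
import math

C = 2.99792458e8  # speed of light (m/s), as given in the description


def free_space_loss_db(d_m, f_hz):
    """Assumed form of equation (2): free-space path loss for distance d and frequency f."""
    return 20.0 * math.log10(4.0 * math.pi * d_m * f_hz / C)


def received_power_dbm(p_t, g_r, g_t, d_m, f_hz):
    """Equation (1): P_r = P_t + G_r + G_t - L."""
    return p_t + g_r + g_t - free_space_loss_db(d_m, f_hz)


def distance_from_rssi(p_r, p_t, g_r, g_t, f_hz):
    """Solve equations (1) and (2) for d given a measured P_r."""
    loss = p_t + g_r + g_t - p_r  # L rearranged from equation (1)
    return C / (4.0 * math.pi * f_hz) * 10.0 ** (loss / 20.0)
```

Under this assumed model, each tenfold increase in distance lowers the received signal strength by 20 dB, which matches the logarithmic relationship described for FIG. 6.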

By obtaining in this way the distance between each of the plurality of communication devices 10 and the terminal device 20 at the time an object OB was imaged, the position of the terminal device 20 in the space S at that time (that is, the imaging position of that object OB) can be estimated. When three or more communication devices 10 are provided in the space S, the position of the terminal device 20 on a predetermined plane in the space S can be estimated, and when four or more communication devices 10 are provided, the three-dimensional position of the terminal device 20 in the space S can be estimated.
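For the planar case with three communication devices, the position estimate can be sketched as a simple trilateration. The description does not state which estimation method is used, so the following is one common approach shown under assumptions: the device coordinates are known, the distances are exact, and the function name `trilaterate_2d` is hypothetical. Subtracting the first circle equation from the other two gives a 2×2 linear system, which is solved here by Cramer's rule.

```python
def trilaterate_2d(anchors, distances):
    """Estimate (x, y) from three known anchor positions and distances to them.

    anchors:   ((x1, y1), (x2, y2), (x3, y3)), positions of the communication devices
    distances: (d1, d2, d3), distances from each device to the terminal
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    # Linear system obtained by subtracting circle 1 from circles 2 and 3.
    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    b1 = d1 ** 2 - d2 ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = d1 ** 2 - d3 ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; the position is not unique")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

With noisy distances derived from measured signal strength, a least-squares fit over more than three devices would normally be preferred; that refinement is outside what the description specifies.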

The identification means 43 has a function of identifying, when a captured image and environment information have been acquired, which of the plurality of objects OB the imaged object OB is, based on the acquired captured image and environment information and on a learned model obtained by machine learning that uses captured images and environment information as learning data.

The function of the identification means 43 is realized, for example, as follows. When the CPU 31 of the identification device 30 acquires a captured image and environment information by the functions of the first acquisition means 41 and the second acquisition means 42, it inputs them to a learned model (described later) obtained by machine learning that uses captured images and environment information as learning data, and thereby identifies which of the plurality of objects OB the imaged object OB is. An example of the learning data is shown in FIG. 7. The learning data shown in FIG. 7 describes, for each captured image, the environment information at the imaging position (in the example of FIG. 7, the received signal strength values corresponding to the respective communication devices 10 (communication devices A to D) at the imaging position) in association with the imaged object OB, which is the correct-answer data (in the example of FIG. 7, one of "a", "b", "c", and "d"). As a result of the machine learning, a learned model is constructed that represents the relationship between the captured image and environment information (here, the received signal strength corresponding to each communication device 10) on the one hand and the imaged object OB on the other.

なお、ＣＰＵ３１は、撮像されたオブジェクトＯＢが何れのオブジェクトＯＢ（例えば、「ａ」のオブジェクトＯＢ）であるかを識別すると、例えば、識別されたオブジェクトＯＢ（ここでは、「ａ」のオブジェクトＯＢ）に関する情報を提示してもよいし（例えば、表示部３６に表示することであってもよいし、スピーカ等の音声出力装置から出力すること等であってもよい）、第１取得手段４１及び第２取得手段４２の機能に基づいて取得した撮像画像及び環境情報と、識別されたオブジェクトＯＢに関する情報とを対応付けた状態で例えばＲＡＭ３３又は記憶装置３４に記憶してもよい。 Note that, when the CPU 31 identifies which object OB (for example, the object OB of "a") the imaged object OB is, for example, the identified object OB (here, the object OB of "a") (For example, it may be displayed on the display unit 36, or may be output from an audio output device such as a speaker), the first acquisition means 41 and The captured image and the environment information acquired based on the function of the second acquisition means 42 and the information related to the identified object OB may be stored in the RAM 33 or the storage device 34 in association with each other.

取得手段４４は、所定空間Ｓ内に存在する複数のオブジェクトＯＢのうち何れかのオブジェクトＯＢの撮像画像と、当該何れかのオブジェクトを撮像した位置における環境情報とを取得する機能を備える。 Acquisition means 44 has a function of acquiring a captured image of any object OB among a plurality of objects OB existing in the predetermined space S, and environmental information at the position where the object was captured.

The function of the acquisition means 44 is realized, for example, as follows. The case described here as an example is one in which the acquisition means 44 acquires, as the environment information at the position where an object OB was imaged, the received signal strength observed when the terminal device 20 receives the radio signal transmitted from each communication device 10. In the same manner as the functions of the first acquisition means 41 and the second acquisition means 42, when the CPU 31 of the identification device 30 receives (acquires) the captured image data and the received signal strength values corresponding to the respective communication devices 10 from the terminal device 20 via the communication interface unit 38, it stores them, in association with each other, in the learning data shown in FIG. 7, for example. In addition, when correct-answer data indicating which of the plurality of objects OB ("a", "b", "c", and "d" in the example of FIG. 7) was imaged is input using the input unit 37 for a captured image stored in the learning data, the CPU 31 stores the input correct-answer data in the "object" item corresponding to that captured image.

The learning means 45 has a function of learning, by machine learning that uses the acquired captured images and environment information as learning data, a model used for identifying which of the plurality of objects OB the imaged object OB is.

学習手段４５の機能は、例えば以下のように実現される。識別装置３０のＣＰＵ３１は、例えば、所定のモデル学習指示が入力部３７を用いて入力されると、図７に示す学習用データを用いてモデルの学習を行う。ここで、ＣＰＵ３１は、図８（ａ）～（ｃ）に示す３種類のモデルのうち何れかのモデルを用いて学習してもよい。各モデルについて以下に説明する。 The function of the learning means 45 is realized, for example, as follows. For example, when a predetermined model learning instruction is input using the input unit 37, the CPU 31 of the identification device 30 learns the model using the learning data shown in FIG. Here, the CPU 31 may learn using any one of the three types of models shown in FIGS. 8(a) to 8(c). Each model is described below.

In the model shown in FIG. 8(a), the input layer first reshapes the plural pieces of environment information (here, the received signal strength corresponding to each of the plurality of communication devices 10) to the same aspect ratio as the image. This is considered to make it possible to extract features of the plural pieces of environment information and of the image simultaneously. The model then learns from the tensor obtained by concatenating each reshaped piece of environment information with the image. Although three images are shown in FIG. 8(a), these are tensors consisting of the R, G, and B pixels of the captured image. Let x_ijk be the tensor representing the image (i is the number of vertical pixels, j is the number of horizontal pixels, and k is the number of channels), let s_l be the tensor representing the environment information (l is the number of devices that measure the environment information (here, the number of terminal devices 20)), let X be the tensor obtained by operating on x_ijk and s_l, and let f and f' be neural networks. Then the probability f_θ(x_ijk, s_l) that the input belongs to each class (here, each of the objects "a", "b", "c", and "d") can be calculated using the following equations (3) and (4).

In equation (3), P_ij is a tensor with the same aspect ratio as x_ijk whose elements are all 1, and || denotes concatenation of tensors. θ is determined using stochastic gradient descent. Because concatenation is performed at the input layer, this model is called mCNN-c (multimodal Convolutional Neural Network concatenate) in the present embodiment.
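The input construction described for mCNN-c can be sketched as follows. Equations (3) and (4) are not reproduced in this text, so the sketch assumes that each received signal strength value s_l is broadcast over an all-ones plane P_ij of the image's height and width and then concatenated with the image along the channel axis. The function name `build_mcnn_c_input` and the nested-list tensor layout (height × width × channels) are illustrative assumptions.

```python
def build_mcnn_c_input(image, rssi):
    """Concatenate constant RSSI planes with the image along the channel axis.

    image: nested list of shape H x W x K (e.g. K = 3 for RGB)
    rssi:  list of l received signal strength values, one per device
    Returns a nested list of shape H x W x (K + l); each extra channel is
    s_l * P_ij, i.e. a plane filled with the single value s_l.
    """
    return [[list(pixel) + list(rssi) for pixel in row] for row in image]
```

For a 2×3 RGB image and four communication devices, the result has 3 + 4 = 7 channels per pixel, and a convolutional network would then be applied to this combined tensor. Any scaling of the signal strength values to the pixel value range is left out because the description does not mention it.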

In the model shown in FIG. 8(b), the input layer applies weights to the plural pieces of environment information (here, the received signal strength corresponding to each of the plurality of communication devices 10), and the model learns from their sum with the image. This is considered to make it possible to express the degree of influence of each piece of environment information. Letting w_l be the weight for s_l, the probability f_θ(x_ijk, s_l) that the input belongs to each class (here, each of the objects "a", "b", "c", and "d") can be calculated using the following equations (5) and (6).

θ and w are determined using stochastic gradient descent. Because weighting is performed at the input layer, this model is called mCNN-w (multimodal Convolutional Neural Network weighted) in the present embodiment.
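One possible reading of the mCNN-w input can be sketched as follows. Equations (5) and (6) are not reproduced in this text, so the form used here, in which the weighted sum of the signal strength values is added to every element of the image tensor, is an assumption drawn only from the phrase "sum with the image". The function name `build_mcnn_w_input` is hypothetical, and in the actual model the weights w_l would be learned by stochastic gradient descent rather than supplied by hand.

```python
def build_mcnn_w_input(image, rssi, weights):
    """Add the weighted sum of RSSI values to every element of the image tensor.

    image:   nested list of shape H x W x K
    rssi:    list of l received signal strength values s_l
    weights: list of l weights w_l (learned in the real model; fixed here)
    """
    if len(rssi) != len(weights):
        raise ValueError("rssi and weights must have the same length")
    offset = sum(w * s for w, s in zip(weights, rssi))
    return [[[value + offset for value in pixel] for pixel in row]
            for row in image]
```

Unlike the concatenation approach, this keeps the tensor shape identical to that of the image, so the number of input channels seen by the network does not grow with the number of communication devices.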

In general, a convolutional neural network (CNN) reshapes the tensor to rank 1 before the output layer and then applies fully connected layers. The model shown in FIG. 8(c) concatenates the rank-1 tensor obtained by applying a CNN to the image with the rank-1 tensor obtained by applying a stack of fully connected layers to the environment information, and then applies further fully connected layers. This is considered to allow the features of the image to be extracted by the CNN and the features of the environment information to be extracted by the FNN (Forward Neural Network). Letting f'' and f''' be the intermediate layers, the probability f_θ(x, s) that the input belongs to each class (here, each of the objects "a", "b", "c", and "d") can be calculated using the following equation (7).

θ is determined using stochastic gradient descent. Because the concatenation is performed before the fully connected layers, this model is called mCNN-f (multimodal Convolutional Neural Network fully-connected) in the present embodiment. The inventors found that, among the three models described above, mCNN-f yields highly accurate identification results with a small number of training iterations.
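The late-fusion structure of mCNN-f can be sketched as a forward pass. Equation (7) is not reproduced in this text, so the layer sizes, the ReLU activation, the random initial weights, and all function names below are assumptions. The CNN branch is replaced by an already-flattened feature vector passed in as `image_features`, so the sketch shows only the fusion step, not the patented network.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def random_layer(rng, n_out, n_in):
    """Hypothetical initializer: small uniform weights and zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def mcnn_f_forward(image_features, rssi, params):
    """Fuse rank-1 image features with FNN-processed RSSI, then classify."""
    h_env = list(rssi)
    for weights, bias in params["env"]:           # stacked fully connected layers
        h_env = relu(dense(h_env, weights, bias))
    fused = list(image_features) + h_env          # concatenation of rank-1 tensors
    h = relu(dense(fused, *params["fused"]))      # fully connected layer after fusion
    return softmax(dense(h, *params["out"]))      # class probabilities
```

The returned list contains one probability per class, for example four values for the objects "a" to "d". Training the parameters by stochastic gradient descent is omitted here.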

The environment information used as learning data may include a plurality of combinations of the received signal strengths of the communication devices 10 that are sampled, on the assumption that the received signal strength corresponding to each communication device 10 follows a Gaussian distribution that differs for each imaged object OB and for each communication device 10, according to the Gaussian distributions of the received signal strength of each communication device 10 corresponding to a predetermined object OB among the plurality of objects OB. For example, assuming that a predetermined object OB among the plurality of objects OB has been imaged, a received signal strength is sampled for each communication device 10 according to the Gaussian distribution of the received signal strength of that communication device 10 corresponding to the predetermined object OB, so that a plurality of combinations of received signal strengths corresponding to the predetermined object OB can be generated easily. This makes it possible to increase the amount of environment information included in the learning data (data augmentation), so that machine learning can proceed efficiently.

The function of the acquisition means 44 or the learning means 45 in this case is realized, for example, as follows. The CPU 31 of the identification device 30 may use the learning data to perform the data augmentation processing described below on the environment information (here, the received signal strength corresponding to each communication device 10) at predetermined timings (for example, each time a predetermined time elapses, or each time a predetermined number of captured images and pieces of environment information have been stored).

FIGS. 9(a) to 9(c) show examples of the distribution of the received signal strength corresponding to a communication device 10 when one of the plurality of objects OB is imaged. FIG. 9(a) shows an example of the distribution of the received signal strength corresponding to communication device A when the object OB "a" is imaged, FIG. 9(b) shows an example of the distribution of the received signal strength corresponding to communication device A when the object OB "b" is imaged, and FIG. 9(c) shows an example of the distribution of the received signal strength corresponding to communication device B when the object OB "a" is imaged. These distributions can be generated using, for example, the learning data.

As shown in FIGS. 9(a) to 9(c), each distribution approximates a bell-shaped distribution (Gaussian distribution) that is symmetric about the mean received signal strength. It can therefore be assumed that the received signal strength corresponding to each communication device 10 follows a Gaussian distribution that differs for each imaged object OB (for each of the objects OB "a", "b", "c", and "d") and for each communication device (for each of communication devices A to D). Accordingly, as the data augmentation processing of the environment information, the CPU 31 of the identification device 30 may sample a plurality of combinations of the received signal strengths of the communication devices 10 (communication devices A to D) corresponding to a predetermined object OB (for example, the object OB "a") among the plurality of objects OB, according to the Gaussian distributions of the received signal strengths of the communication devices 10 (communication devices A to D) observed when that predetermined object OB was imaged. The CPU 31 may then store the sampled combinations of the received signal strengths of the communication devices 10 (communication devices A to D) in the learning data in association with the predetermined object OB (here, the object OB "a").
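The augmentation step can be sketched as follows: for one object, a mean and standard deviation are estimated per communication device from the existing learning data, and new combinations of received signal strengths are drawn from those Gaussian distributions. The function names, the dictionary layout, and the use of the sample standard deviation are assumptions. Each device is sampled independently, as the per-device distributions in the description suggest.

```python
import random
import statistics


def fit_gaussians(samples):
    """Estimate (mean, stdev) per device from RSSI samples of ONE object.

    samples: list of dicts {device_id: rssi}, at least two entries.
    """
    devices = list(samples[0].keys())
    return {
        d: (statistics.mean([s[d] for s in samples]),
            statistics.stdev([s[d] for s in samples]))
        for d in devices
    }


def augment_rssi(samples, label, n_new, rng):
    """Draw n_new synthetic RSSI combinations for the object named by label."""
    params = fit_gaussians(samples)
    return [
        {"rssi": {d: rng.gauss(mu, sigma) for d, (mu, sigma) in params.items()},
         "object": label}
        for _ in range(n_new)
    ]
```

For example, `augment_rssi(samples_for_a, "a", 100, random.Random(0))` would produce 100 synthetic rows labelled "a". How the synthetic rows are paired with captured images is not specified in the description and is left out here.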

(5) Flow of Main Processing of Model Learning System of the Present Embodiment
Next, an example of the flow of the main processing performed by the model learning system of the present embodiment will be described with reference to the flowchart of FIG. 10.

先ず、端末装置２０のＣＰＵ２１は、複数のオブジェクトＯＢのうち何れかのオブジェクトＯＢ（例えば、「ａ」のオブジェクトＯＢ）が撮像部２８の撮像範囲内に存在する場合に、所定の撮像指示が入力部２７を用いて入力されると、撮像部２８に対して撮像処理を行わせる。撮像部２８によって撮像された画像（ここでは、「ａ」のオブジェクトＯＢの撮像画像）のデータは、例えばＲＡＭ２３又は記憶装置２４に記憶される。また、端末装置２０のＣＰＵ２１は、各通信装置１０から送信された無線信号を受信する毎に、ＲＳＳＩ回路によって検出された当該無線信号の受信信号強度の値を例えばＲＡＭ２３又は記憶装置２４に記憶する。そして、ＣＰＵ２１は、撮像部２８に対して撮像処理を行わせた場合に、例えばＲＡＭ２３又は記憶装置２４に記憶された受信信号強度の値のうち最新の受信信号強度の値を複数の通信装置１０毎に抽出し、抽出した受信信号強度の値と、例えばＲＡＭ２３又は記憶装置２４に記憶された撮像画像のデータとを、通信インタフェース部２９及び通信網ＮＷを介して識別装置３０に送信する。 First, the CPU 21 of the terminal device 20 receives a predetermined imaging instruction when any object OB among a plurality of objects OB (for example, the object OB of "a") exists within the imaging range of the imaging unit 28. When input using the unit 27, the imaging unit 28 is caused to perform imaging processing. The data of the image captured by the imaging unit 28 (here, the captured image of the object OB of “a”) is stored in the RAM 23 or the storage device 24, for example. Also, each time the CPU 21 of the terminal device 20 receives a radio signal transmitted from each communication device 10, the received signal strength value of the radio signal detected by the RSSI circuit is stored in the RAM 23 or the storage device 24, for example. . Then, when causing the imaging unit 28 to perform imaging processing, the CPU 21 sends the latest received signal strength value among the received signal strength values stored in the RAM 23 or the storage device 24 to the plurality of communication devices 10, for example. The value of the received signal strength extracted for each time and the captured image data stored in, for example, the RAM 23 or the storage device 24 are transmitted to the identification device 30 via the communication interface unit 29 and the communication network NW.

Meanwhile, when the CPU 31 of the identification device 30 receives (acquires) from the terminal device 20, via the communication interface unit 38, the captured image data of an object OB and the environment information at the imaging position (here, the received signal strength values corresponding to the respective communication devices 10) (step S100), it stores the received captured image data and the received signal strength values corresponding to the respective communication devices 10 in the learning data in association with each other. In addition, when correct-answer data indicating which of the plurality of objects OB was imaged is input using the input unit 37 for a captured image stored in the learning data, the CPU 31 stores the input correct-answer data in the "object" item corresponding to that captured image.

Next, the CPU 31 of the identification device 30 learns, by machine learning that uses the acquired captured images and environment information as learning data, a model used for identifying which of the plurality of objects OB the imaged object OB is (step S102). Specifically, when, for example, a predetermined model learning instruction is input using the input unit 37, the CPU 31 trains the model using the learning data. Here, the CPU 31 may train any one of the three models described above, namely mCNN-c, mCNN-w, and mCNN-f.

In this way, a model used for identifying which of the plurality of objects OB the imaged object OB is can be learned by machine learning that uses captured images and environment information as learning data.

(6) Flow of Main Processing of Object Identification System of the Present Embodiment
Next, an example of the flow of the main processing performed by the object identification system of the present embodiment will be described with reference to the flowchart of FIG. 11.

First, when the CPU 31 of the identification device 30 receives (acquires) the captured image data transmitted from the terminal device 20 via the communication interface unit 38 (step S200), it stores the received captured image data in, for example, the RAM 33 or the storage device 34.

Next, when the CPU 31 of the identification device 30 receives (acquires), via the communication interface unit 38, the received signal strength values corresponding to the respective communication devices 10 transmitted from the terminal device 20 (step S202), it stores the captured image data acquired in step S200 and the received signal strength values corresponding to the respective communication devices 10 in the acquired data in association with each other.

Then, having acquired the captured image and the environment information, the CPU 31 of the identification device 30 identifies which of the plurality of objects OB the imaged object OB is, based on the captured image and environment information acquired in steps S200 and S202 and on the learned model obtained by machine learning that uses captured images and environment information as learning data (step S204).

In this way, when a captured image and environment information are acquired, it becomes possible to identify which of the plurality of objects OB the imaged object OB is, based on the acquired captured image and environment information and on the trained model obtained by machine learning that used captured images and environment information as learning data.
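The flow of steps S200 to S204 can be pictured with the following minimal Python sketch. It is an illustration under stated assumptions, not the implementation of the embodiment: the class names, the device identifiers "10a" to "10c", the dBm values, and in particular `toy_model` are all hypothetical. `toy_model` merely stands in for the trained model so that the example runs; the actual model would be the one learned from captured images and environment information.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence, Tuple


@dataclass
class AcquiredRecord:
    """One captured image stored in association with its RSSI values."""
    image: bytes
    rssi_by_device: Dict[str, float]


@dataclass
class IdentificationDevice:
    # Hypothetical stand-in for the trained model: (image, RSSI vector) -> object ID.
    model: Callable[[bytes, Sequence[float]], str]
    # Fixed ordering of communication devices used to build the RSSI vector.
    device_order: Tuple[str, ...]
    acquired_data: List[AcquiredRecord] = field(default_factory=list)
    pending_image: Optional[bytes] = field(default=None, init=False, repr=False)

    def receive_image(self, image: bytes) -> None:
        """Step S200: hold the received captured image data."""
        self.pending_image = image

    def receive_rssi(self, rssi_by_device: Dict[str, float]) -> AcquiredRecord:
        """Step S202: store the image and RSSI values in association with each other."""
        if self.pending_image is None:
            raise RuntimeError("no captured image was received before the RSSI values")
        record = AcquiredRecord(self.pending_image, dict(rssi_by_device))
        self.acquired_data.append(record)
        self.pending_image = None
        return record

    def identify(self, record: AcquiredRecord) -> str:
        """Step S204: identify the imaged object from image plus environment information."""
        rssi_vector = [record.rssi_by_device[d] for d in self.device_order]
        return self.model(record.image, rssi_vector)


def toy_model(image: bytes, rssi_vector: Sequence[float]) -> str:
    # Placeholder only: ignores the image and picks the object assumed to sit
    # nearest the communication device with the strongest signal.
    strongest = max(range(len(rssi_vector)), key=lambda i: rssi_vector[i])
    return f"OB-{strongest + 1}"


device = IdentificationDevice(model=toy_model, device_order=("10a", "10b", "10c"))
device.receive_image(b"captured-image-bytes")
record = device.receive_rssi({"10a": -70.0, "10b": -48.0, "10c": -81.0})
result = device.identify(record)
```

Keeping the image and the RSSI values in a single record mirrors the association made in step S202, so that the identification step always sees a matched pair.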

As described above, according to the object identification system, object identification method, and program of this embodiment, when a captured image and environment information are acquired, which of the plurality of objects OB the imaged object OB is can be identified based on the acquired captured image and environment information and on a trained model obtained by machine learning that used captured images and environment information as learning data. Thus, for example, instead of using only the captured image of an object OB to identify the class to which that object OB belongs, the environment information at the imaging position of the object OB is used together with the captured image, which makes it possible to identify the object OB itself contained in the captured image. This improves the individual recognition performance for the imaged object OB.

Further, according to the model learning system, model learning method, and program of this embodiment, machine learning using captured images and environment information as learning data makes it possible to learn a model used to identify which of the plurality of objects OB an imaged object OB is. By using this model, the object OB itself contained in a captured image can be identified.

Modifications of the above-described embodiment will be described below.
(Modification 1)
In the above embodiment, the case where the second acquisition means 42 acquires the received signal strength observed when the terminal device 20 receives the radio signals transmitted from the respective communication devices 10 has been described as an example, but the invention is not limited to this case. For example, the second acquisition means 42 may acquire the received signal strength observed when each communication device 10 receives a radio signal transmitted from the terminal device 20. In that case, each communication device 10 may be provided with an RSSI circuit that detects the received signal strength when the radio signal transmitted from the terminal device 20 is received.

In this case, the CPU 21 of the terminal device 20 may request each communication device 10 to transmit the received signal strength value of the radio signal to the terminal device 20. Then, as a function of the second acquisition means 42, the CPU 31 of the identification device 30 may, upon receiving (acquiring) the received signal strength value of each communication device 10 transmitted from the terminal device 20 via the communication interface unit 38, store the received information in the acquired data.

Thus, the object identification system, model learning system, object identification method, model learning method, and program according to this modification can provide the same effects as the above-described embodiment.
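As a rough illustration of Modification 1, the sketch below shows the terminal side collecting the received signal strength that each communication device measured for the terminal's own radio signal. The class, the field names, the identifiers, and the use of dBm are assumptions made for this example; the text does not specify a message format or a unit.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class CommunicationDevice:
    device_id: str
    # Value assumed to have been measured by the device's RSSI circuit
    # for the radio signal transmitted from the terminal device (unit assumed: dBm).
    last_rssi_dbm: float

    def report_rssi(self) -> float:
        """Answer the terminal's request with the measured received signal strength."""
        return self.last_rssi_dbm


def collect_rssi(devices: Iterable[CommunicationDevice]) -> Dict[str, float]:
    """Terminal side: request the RSSI value from every communication device."""
    return {d.device_id: d.report_rssi() for d in devices}


devices = [
    CommunicationDevice("10a", -63.5),
    CommunicationDevice("10b", -71.0),
]
rssi_values = collect_rssi(devices)  # forwarded to the identification device afterwards
```

The resulting mapping has the same shape as the values the terminal would measure itself in the main embodiment, so the identification device can store it in the acquired data without change.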

(Modification 2)
In the above embodiment, the case has been described, as an example, in which the second acquisition means 42 acquires, as the environment information, the received signal strength observed when a radio signal transmitted from one of (a) the communication devices 10 provided at a plurality of positions in the space S and (b) the terminal device 20 present at the position where any of the objects OB was imaged is received by the other. However, the invention is not limited to this case. For example, the second acquisition means 42 may acquire, as the environment information, the temperature, humidity, atmospheric pressure, illuminance, geomagnetism, latitude, longitude, altitude, or the like at the position where any of the objects OB was imaged. In this case, the terminal device 20 may be provided with sensor devices that perform measurement when any of the objects OB is imaged, such as a temperature sensor, a humidity sensor, an atmospheric pressure sensor, an illuminance sensor, a geomagnetic sensor, and a GPS (Global Positioning System) sensor.
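One way to picture the environment information of Modification 2 is as a fixed-order numeric vector built from the sensor readings taken at capture time. The following sketch assumes this representation; the field names, the units, and the sample values are hypothetical and are not specified in the text.

```python
from dataclasses import astuple, dataclass
from typing import List


@dataclass(frozen=True)
class EnvironmentInfo:
    """Sensor readings taken when an object is imaged (units are assumptions)."""
    temperature_c: float
    humidity_pct: float
    pressure_hpa: float
    illuminance_lux: float
    geomagnetism_ut: float
    latitude_deg: float
    longitude_deg: float
    altitude_m: float

    def to_vector(self) -> List[float]:
        """Return the readings in declaration order, ready to pair with an image."""
        return [float(value) for value in astuple(self)]


reading = EnvironmentInfo(
    temperature_c=21.5,
    humidity_pct=40.0,
    pressure_hpa=1013.2,
    illuminance_lux=320.0,
    geomagnetism_ut=46.1,
    latitude_deg=39.8,
    longitude_deg=141.1,
    altitude_m=155.0,
)
features = reading.to_vector()
```

A fixed field order matters here because a learned model would expect each position of the vector to carry the same quantity during both learning and identification.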

The program of the present invention may be stored in a computer-readable storage medium. The storage medium on which the program is recorded may be the ROM 22, RAM 23, or storage device 24 of the terminal device 20 shown in FIG. 2, or the ROM 32, RAM 33, or storage device 34 of the identification device 30 shown in FIG. 3. It may also be, for example, a CD-ROM or the like that can be read by being inserted into a program reading device such as a CD-ROM drive. Further, the storage medium may be a magnetic tape, a cassette tape, a flexible disk, an MO/MD/DVD, or the like, or may be a semiconductor memory.

The embodiment and modifications described above are presented to facilitate understanding of the present invention and are not intended to limit it. Accordingly, each element disclosed in the above embodiment and modifications is intended to encompass all design changes and equivalents that fall within the technical scope of the present invention.

For example, in the above-described embodiment, the case of performing data augmentation on the environment information has been described as an example, but data augmentation may also be performed on the captured images. In this case, the amount of image data included in the learning data may be increased by applying rotation, clipping, horizontal flipping, or the like to the captured images.
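The two kinds of data augmentation mentioned above can be sketched as follows. This is a minimal illustration, not the procedure of the embodiment: images are represented as nested lists of pixel values instead of a real image array, the clipping window is chosen arbitrarily, and the Gaussian parameters and the helper names are hypothetical. The sampling function follows the assumption stated in the claims that the received signal strength of each communication device follows its own Gaussian distribution for a given object.

```python
import random
from typing import List, Sequence, Tuple

Image = List[List[int]]  # rows of pixel values; a stand-in for a real image array


def rotate90(image: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def clip(image: Image, top: int, left: int, height: int, width: int) -> Image:
    """Cut out a height x width window whose upper-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def augment_image(image: Image) -> List[Image]:
    """Return the original image plus rotated, flipped, and clipped variants."""
    h, w = len(image), len(image[0])
    clipped = clip(image, 0, 0, max(1, h - 1), max(1, w - 1))
    return [image, rotate90(image), flip_horizontal(image), clipped]


def sample_rssi(
    gaussians: Sequence[Tuple[float, float]], n: int, rng: random.Random
) -> List[List[float]]:
    """Draw n RSSI combinations for one object, one value per communication device.

    Each (mean, std) pair describes the Gaussian assumed for one device.
    """
    return [[rng.gauss(mean, std) for mean, std in gaussians] for _ in range(n)]


image = [[1, 2, 3], [4, 5, 6]]
variants = augment_image(image)
# Hypothetical per-device Gaussians (mean, standard deviation) for one object.
rssi_samples = sample_rssi([(-55.0, 2.0), (-72.0, 3.5)], n=5, rng=random.Random(0))
```

Each augmented image can then be paired with one sampled RSSI combination for the same object to enlarge the learning data.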

In the above-described embodiment, the identification device 30 realizes the functions of the first acquisition means 41, the second acquisition means 42, the identification means 43, the acquisition means 44, and the learning means 45, but the invention is not limited to this configuration. For example, a learning device 50 (shown in FIG. 12) may be provided for learning the model used to identify which of the plurality of objects OB an imaged object OB is. The learning device 50 is composed of a computer or the like (for example, a general-purpose personal computer or a server computer) communicably connected to the identification device 30 via a communication network such as the Internet or a LAN. In this case, the identification device 30 and the learning device 50 can have substantially the same hardware configuration, so the function of at least one of the means 41 to 45 described in the above embodiment can be realized by the learning device 50. For example, the functions in the functional block diagram shown in FIG. 4 may be divided in any manner between the identification device 30 and the learning device 50, as shown in FIGS. 12(a) and 12(b).

The object identification system, model learning system, object identification method, model learning method, and program of the present invention as described above can identify the object itself contained in a captured image and can be suitably used in, for example, image recognition systems, so their industrial applicability is extremely high.

Description of symbols
10 ... Communication device
20 ... Terminal device
30 ... Identification device
41 ... First acquisition means
42 ... Second acquisition means
43 ... Identification means
44 ... Acquisition means
45 ... Learning means
50 ... Learning device
OB ... Object
S ... Space

Claims

1. An object identification system comprising:
a first acquisition means for acquiring a captured image of any one of a plurality of objects present in a predetermined space;
a second acquisition means for acquiring environment information at the position where the object was imaged; and
an identification means for identifying, when the captured image and the environment information are acquired, which of the plurality of objects the imaged object is, based on the acquired captured image and environment information and on a trained model based on machine learning using captured images and environment information as learning data,
wherein the second acquisition means acquires, as the environment information, the received signal strength observed when a radio signal transmitted from one of (i) communication devices provided at a plurality of positions within the predetermined space and (ii) a terminal device present at the position where the object was imaged is received by the other.

2. The object identification system of claim 1, wherein the environment information used as the learning data includes a plurality of combinations of received signal strengths of the respective communication devices, the combinations being sampled, on the assumption that the received signal strength corresponding to each communication device follows a Gaussian distribution that differs for each imaged object and for each communication device, according to the Gaussian distribution of the received signal strength of each communication device corresponding to a predetermined object among the plurality of objects.

3. A model learning system comprising:
an acquisition means for acquiring a captured image of any one of a plurality of objects present in a predetermined space and environment information at the position where the object was imaged; and
a learning means for learning, by machine learning using the acquired captured image and environment information as learning data, a model used to identify which of the plurality of objects an imaged object is,
wherein the acquisition means acquires, as the environment information, the received signal strength observed when a radio signal transmitted from one of (i) communication devices provided at a plurality of positions within the predetermined space and (ii) a terminal device present at the position where the object was imaged is received by the other.

4. An object identification method for causing a computer to execute the steps of:
acquiring a captured image of any one of a plurality of objects present in a predetermined space;
acquiring environment information at the position where the object was imaged; and
identifying, when the captured image and the environment information are acquired, which of the plurality of objects the imaged object is, based on the acquired captured image and environment information and on a trained model based on machine learning using captured images and environment information as learning data,
wherein the step of acquiring the environment information includes a step of acquiring, as the environment information, the received signal strength observed when a radio signal transmitted from one of (i) communication devices provided at a plurality of positions within the predetermined space and (ii) a terminal device present at the position where the object was imaged is received by the other.

5. A model learning method for causing a computer to execute the steps of:
acquiring a captured image of any one of a plurality of objects present in a predetermined space and environment information at the position where the object was imaged; and
learning, by machine learning using the acquired captured image and environment information as learning data, a model used to identify which of the plurality of objects an imaged object is,
wherein the step of acquiring the captured image and the environment information includes a step of acquiring, as the environment information, the received signal strength observed when a radio signal transmitted from one of (i) communication devices provided at a plurality of positions within the predetermined space and (ii) a terminal device present at the position where the object was imaged is received by the other.

6. A program for causing a computer to realize:
a function of acquiring a captured image of any one of a plurality of objects present in a predetermined space;
a function of acquiring environment information at the position where the object was imaged; and
a function of identifying, when the captured image and the environment information are acquired, which of the plurality of objects the imaged object is, based on the acquired captured image and environment information and on a trained model based on machine learning using captured images and environment information as learning data,
wherein the function of acquiring the environment information includes a function of acquiring, as the environment information, the received signal strength observed when a radio signal transmitted from one of (i) communication devices provided at a plurality of positions within the predetermined space and (ii) a terminal device present at the position where the object was imaged is received by the other.

7. A program for causing a computer to realize:
a function of acquiring a captured image of any one of a plurality of objects present in a predetermined space and environment information at the position where the object was imaged; and
a function of learning, by machine learning using the acquired captured image and environment information as learning data, a model used to identify which of the plurality of objects an imaged object is,
wherein the function of acquiring the captured image and the environment information includes a function of acquiring, as the environment information, the received signal strength observed when a radio signal transmitted from one of (i) communication devices provided at a plurality of positions within the predetermined space and (ii) a terminal device present at the position where the object was imaged is received by the other.