JP2021086271A - Information processing apparatus, information processing system, information processing method, and program

Info

Publication number: JP2021086271A
Application number: JP2019213207A
Authority: JP
Inventors: Atsuo Nomoto (野本 敦夫); Hiroshi Sato (佐藤 博); Takahisa Yamamoto (山本 貴久); Satoru Yashiro (八代 哲); Hideo Noro (野呂 英生); Toshiaki Nakano (中野 俊亮); Takatsugu Makita (牧田 孝嗣); Masayoshi Yamazaki (山崎 将由)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-11-26
Filing date: 2019-11-26
Publication date: 2021-06-03

Abstract

PROBLEM TO BE SOLVED: To make it easier to detect again a subject that has already been detected once, in a system that detects a subject to be detected. SOLUTION: An information processing apparatus comprises an image acquisition unit that acquires an image, a detection unit that detects a subject to be detected from the image, a confirmation result acquisition unit that acquires the result of a user's visual confirmation of the detection result produced by the detection unit, and a detection condition change unit that changes the detection condition used when the detection unit detects the subject. The detection condition change unit changes the detection condition based on the result of visual confirmation acquired by the confirmation result acquisition unit. SELECTED DRAWING: Figure 3

Description

The present invention relates to an information processing apparatus, an information processing system, an information processing method, and a program for detecting a specific subject from an image.

Person authentication techniques that identify a person appearing in an image are known. Applications include systems that use surveillance cameras to detect persons requiring attention, important customers, or lost children. Because the cameras in such systems are installed in a variety of places, the appearance of a person varies with lighting conditions, face orientation, and so on, which makes person authentication difficult. Since an authentication result may be wrong, in actual operation a human sometimes checks the result produced by the apparatus. For example, in detecting persons requiring attention, when the apparatus detects a person registered in advance, the user visually confirms whether the detection result is correct and, if so, speaks to that person. Patent Document 1 proposes a technique in which an apparatus detects a customer and the detection result is corrected based on the result of visual confirmation by a person.

Patent Document 1: Japanese Unexamined Patent Publication No. 2012-141708

However, the technique described in Patent Document 1 presupposes that the position of the detected person is known, and it cannot cope with the case where that person is lost from sight. Even when the user tries to approach the detected person, crowding or the like may prevent immediate contact, and if the person is lost from sight, the detection result may be wasted. When cameras are installed at multiple locations, it would suffice for another camera to detect the person again, but shooting conditions such as lighting and face orientation differ from those of the camera that made the detection, so there is no guarantee that this will succeed. The present invention has been made in view of these circumstances, and its object is to make it easier to detect again a subject that has been detected once.

An information processing apparatus according to the present invention comprises: image acquisition means for acquiring an image; detection means for detecting a subject to be detected from the image; confirmation result acquisition means for acquiring the result of a user's visual confirmation of the detection result produced by the detection means; and detection condition changing means for changing, based on the result of the visual confirmation acquired by the confirmation result acquisition means, the detection condition used when the detection means detects the subject to be detected.

According to the present invention, a subject that has been detected once is more likely to be detected again, which improves the convenience of the system.

A diagram showing a configuration example of the information processing system in the first embodiment.
A diagram showing a hardware configuration example of the information processing apparatus in the first embodiment.
A diagram showing a functional configuration example of the information processing apparatus in the first embodiment.
A diagram showing a functional configuration example of the detection unit in the first embodiment.
A flowchart showing an example of the detection process in the first embodiment.
A flowchart showing an example of the object detection process in the first embodiment.
A flowchart showing an example of the condition change process in the first embodiment.
A diagram explaining a display example of the detection result in the first embodiment.
A flowchart showing an example of the dictionary registration process in the first embodiment.
A diagram explaining a display example of the detection result in the second embodiment.
A diagram explaining captured face images and the transition of the similarity.

Embodiments of the present invention are described below with reference to the drawings.

<First Embodiment>
The first embodiment of the present invention is described below. The example concerns a detection system that includes a plurality of cameras: a user such as a security guard visually confirms a detection target person, such as a person requiring attention, detected by the apparatus, and the detection condition is changed so that this person is easier to detect again. The user visually checks the person that the apparatus detected in an image captured by one camera. If the user confirms that this is the detection target person, the detection condition for that person is changed so that the person is easier to detect again in images captured by that camera and by the other cameras. The user is assumed to carry a mobile terminal and to obtain the detection result from the apparatus and visually confirm it via the mobile terminal.

FIG. 1 is a diagram showing a configuration example of a detection system serving as the information processing system in the first embodiment. The detection system shown in FIG. 1 includes an information processing apparatus 101, a network 102, cameras 103 and 104, and a mobile terminal 105. The configuration shown in FIG. 1 is only an example and the present embodiment is not limited to it; for example, any number of cameras and mobile terminals may be connected via the network 102. In addition to the mobile terminal 105, the system may include a display device such as a monitor that displays the detection result produced by the apparatus, an input device with which the user enters the result of visual confirmation, and the like.

The information processing apparatus 101 communicates with the cameras 103 and 104, the mobile terminal 105, and other devices via the network 102, performs predetermined processing based on input from those devices, and outputs the processing results to them. For example, the information processing apparatus 101 detects a detection target person from an acquired image and outputs the detection result to each device.

The network 102 is, for example, a local area network and is used for communication among the devices. The network 102 is not limited to a local area network; another communication network may be used. It may be a wireless network, a wired network, or a network that supports both wireless and wired connections.

The cameras 103 and 104 each have a camera unit composed of a lens and an image sensor such as a CCD or CMOS sensor, and a communication device for connecting to the network 102. The cameras 103 and 104 output images captured by the camera unit to the information processing apparatus 101 via the network 102. Other cameras with a communication function may be used instead. The camera unit is not limited to visible light and may be, for example, an infrared camera unit. The cameras 103 and 104 are, for example, surveillance cameras installed at a site.

The mobile terminal 105 has a touch-panel input/output device, a display device, and a communication device. The mobile terminal 105 displays the detection result produced by the information processing apparatus 101 and outputs the result of the user's visual confirmation to the information processing apparatus 101 via the network 102. In the present embodiment, the communication device of the mobile terminal 105 is a wireless communication unit.

The configuration and number of the devices in the detection system shown in FIG. 1 are not limited to those described. Although each device is described above as a separate unit, a single device having several of these functions may be used. For example, the information processing apparatus 101 may be built into the camera 103.

(Hardware configuration of the information processing apparatus)
FIG. 2 is a block diagram showing a hardware configuration example of the information processing apparatus 101 in the first embodiment. As shown in FIG. 2, the information processing apparatus 101 includes a CPU 201, a ROM 202, a RAM 203, a secondary storage device 204, a communication device 205, and an input/output device 206. These components are connected via a connection bus 207 so that they can exchange data with one another.

The CPU (Central Processing Unit) 201 controls the apparatus and the detection system of the present embodiment by executing control programs stored in the ROM 202 or the RAM 203. The ROM (Read Only Memory) 202 is a non-volatile memory that stores control programs, various parameter data, and the like. The control programs are executed by the CPU 201 to realize the processes described later. The RAM (Random Access Memory) 203 is a volatile memory that temporarily stores images, control programs, their execution results, and the like.

The secondary storage device 204 is a rewritable storage device such as a hard disk or flash memory, and stores images and other data received from the cameras 103 and 104 via the communication device 205. It also stores control programs, various settings, processing results, and the like. When the CPU 201 executes processing according to a control program, this information is output from the secondary storage device 204 to the RAM 203 and used there. The communication device 205 communicates with various devices via the network 102 and may be a wired or a wireless communication unit. The input/output device 206 serves as the interface with the outside. It consists of a display that shows images, execution results of control programs, and the like, and a mouse, keyboard, or similar device that accepts input from the user.

In the present embodiment, the processes described later are realized in software running on the CPU 201, but some or all of them may be realized in hardware. A dedicated circuit (ASIC), a processor (reconfigurable processor, DSP), or the like can be used as the hardware. Alternatively, software describing the processes described later may be obtained via a network or various storage media and executed on a processing device (CPU, processor) such as a personal computer.

(Functional configuration of the information processing apparatus)
FIG. 3 is a block diagram showing a functional configuration example of the information processing apparatus 101 in the first embodiment. As shown in FIG. 3, the information processing apparatus 101 includes an image acquisition unit 301, a detection unit 302, a notification unit 303, a detection condition change unit 304, and a confirmation result acquisition unit 305.

The image acquisition unit 301 acquires an image captured by the camera 103 or the camera 104 and outputs it to the detection unit 302. The detection unit 302 detects a detection target person from the image acquired by the image acquisition unit 301 and outputs the detection result to the notification unit 303. The detection condition used to detect a detection target person from an image is determined in advance, and the detection unit 302 acquires a changed detection condition from the detection condition change unit 304 on request.

The notification unit 303 notifies the mobile terminal 105 of the detection result output from the detection unit 302. Because the mobile terminal 105 displays the notified detection result, the user of the mobile terminal 105 can check, via the mobile terminal 105, the result of the detection unit 302 detecting a detection target person. The detection condition change unit 304 changes the detection condition for the detection target person based on the result of the user's visual confirmation acquired by the confirmation result acquisition unit 305, and outputs the changed condition to the detection unit 302. The confirmation result acquisition unit 305 acquires the result of the user's visual confirmation from the mobile terminal 105 and outputs it to the detection condition change unit 304.
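The feedback path among the units above can be illustrated with a minimal sketch. It assumes that the detection condition is a per-person similarity threshold and that a positive visual confirmation relaxes it; the class name, method names, and numeric values are hypothetical and are not taken from this description.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DetectionConditionChanger:
    """Hypothetical stand-in for the detection condition change unit 304."""
    default_threshold: float = 0.7   # assumed initial detection condition
    relaxed_threshold: float = 0.6   # assumed condition after confirmation
    thresholds: Dict[str, float] = field(default_factory=dict)

    def on_confirmation(self, person_id: str, confirmed: bool) -> None:
        # Called with the result obtained by the confirmation result
        # acquisition unit 305. Only a positive confirmation changes the
        # condition for that person (an assumption of this sketch).
        if confirmed:
            self.thresholds[person_id] = self.relaxed_threshold

    def condition_for(self, person_id: str) -> float:
        # What the detection unit 302 would fetch when it asks for the
        # current detection condition of a person.
        return self.thresholds.get(person_id, self.default_threshold)


changer = DetectionConditionChanger()
changer.on_confirmation("P001", confirmed=True)
changer.on_confirmation("P002", confirmed=False)
```

In this sketch only the confirmed person's condition changes, so other registered persons keep the predetermined condition.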

In the present embodiment, the information processing apparatus 101 notifies the mobile terminal 105 of the detection result and acquires the result of visual confirmation from the mobile terminal 105, but a device other than the mobile terminal 105 may be used. For example, the user may be notified of the detection result through the input/output device 206 of the information processing apparatus 101, or through some other device.

FIG. 4 is a block diagram showing a functional configuration example of the detection unit 302 shown in FIG. 3. The detection unit 302 includes a detection unit 401, a feature extraction unit 402, a dictionary registration unit 403, a similarity calculation unit 404, a detection result determination unit 405, and a detection condition acquisition unit 406.

The detection unit 401 detects a predetermined object in the image acquired by the image acquisition unit 301 and outputs the detection result to the feature extraction unit 402. In the present embodiment, the detection unit 401 detects a person's face as the object and outputs, as the detection result, the coordinates of a rectangle enclosing the detected face. The detection result is not limited to this and may be, for example, the center coordinates of the detected face. Based on the detection result output from the detection unit 401, the feature extraction unit 402 extracts a feature amount from the face image of the person detected in the image and outputs the extracted feature amount to the dictionary registration unit 403 or the similarity calculation unit 404.

The dictionary registration unit 403 stores the feature amount extracted by the feature extraction unit 402 in the secondary storage device 204 as a registration dictionary and outputs it to the similarity calculation unit 404. The dictionary registration unit 403 may instead store the extracted feature amount in the RAM 203 as the registration dictionary. The registration dictionary is data that associates the face image of a detection target person, its feature amount, and the corresponding person ID (an identifier such as a number or symbol assigned to identify an individual). The person ID is acquired from, for example, the input/output device 206. The registration dictionary may contain feature amounts of a plurality of persons, and may contain a plurality of feature amounts for one person.
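As an illustration of the registration dictionary described above, the following sketch associates a person ID with one or more pairs of face image and feature amount. The class and field names are hypothetical, and representing the face image as encoded bytes and the feature amount as a list of floats are assumptions of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DictionaryEntry:
    """One registered face: the source image and its feature amount."""
    face_image: bytes        # encoded face image (format is an assumption)
    feature: List[float]     # feature amount extracted from face_image


@dataclass
class RegistrationDictionary:
    """Maps a person ID to every entry registered for that person."""
    entries: Dict[str, List[DictionaryEntry]] = field(default_factory=dict)

    def register(self, person_id: str, face_image: bytes,
                 feature: List[float]) -> None:
        # A person may have several entries, so append rather than overwrite.
        self.entries.setdefault(person_id, []).append(
            DictionaryEntry(face_image, feature))

    def features_of(self, person_id: str) -> List[List[float]]:
        return [entry.feature for entry in self.entries.get(person_id, [])]


registry = RegistrationDictionary()
registry.register("P001", b"face-a", [0.1, 0.9])
registry.register("P001", b"face-b", [0.2, 0.8])
registry.register("P002", b"face-c", [0.7, 0.3])
```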

The similarity calculation unit 404 calculates the similarity between the feature amount extracted by the feature extraction unit 402 and the feature amounts registered in the registration dictionary acquired from the dictionary registration unit 403, and outputs the calculated similarity to the detection result determination unit 405. The detection condition acquisition unit 406 acquires the detection condition for the detection target person from the detection condition change unit 304 and outputs it to the detection result determination unit 405.

The detection result determination unit 405 determines the detection result, that is, whether a detection target person has been detected in the image acquired by the image acquisition unit 301, based on the similarity output from the similarity calculation unit 404 and the detection condition acquired by the detection condition acquisition unit 406. In the present embodiment, the detection result determination unit 405 determines whether the face detected in the image by the detection unit 401 belongs to a target person registered in advance in the dictionary registration unit 403. The detection result determination unit 405 associates this detection result with the face image contained in the registration dictionary and outputs it to the notification unit 303.

Processing related to detecting a detection target person in the detection system of the first embodiment comprises three processes: a detection process that detects the detection target person, a condition change process that changes the detection condition, and a dictionary registration process that registers the detection target person. Each process is described below.

(Detection process)
First, the process in which the information processing apparatus 101 detects a detection target person from captured images while the cameras 103 and 104 capture images (the detection process) is described. FIG. 5 is a flowchart showing an example of the detection process in the first embodiment.

In the detection process, first, in step S501, the image acquisition unit 301 determines whether there is an instruction to end the detection process. This end instruction is acquired from the input/output device 206. If the image acquisition unit 301 determines that there is an end instruction (Yes), the detection process shown in FIG. 5 ends. If it determines that there is no end instruction (No), the image acquisition unit 301 acquires an image captured by the cameras 103 and 104 in step S502.

Subsequently, in step S503, the detection unit 302 uses the image acquired in step S502 to perform the object detection process, which detects a detection target person in the image. The object detection process of step S503 is described later.

Next, in step S504, the detection unit 302 determines whether a detection target person was detected in the image by the object detection process of step S503. If the detection unit 302 determines that any of the detection target persons was detected in the image (Yes), it outputs the detection result to the notification unit 303 in step S505. If it determines that none of the detection target persons was detected (No), the process returns to step S501.

In step S505, the notification unit 303 notifies the mobile terminal 105 of the detection result output from the detection unit 302. The notification associates the detected person ID with the face image registered in the registration dictionary and the face image obtained when the person was detected in the image. As described above, the detection result is sent to the mobile terminal 105 by wireless communication. Because the mobile terminal 105 displays this information, its user can check the detection result of the detection unit 302 via the mobile terminal 105. The content of the notification is explained in detail in the description of the condition change process. After the notification unit 303 notifies the mobile terminal 105 of the detection result in step S505, the process returns to step S501.

The processing described above is repeated until the image acquisition unit 301 determines in step S501 that there is an instruction to end the detection process. This concludes the detection process.
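The loop of steps S501 to S505 can be sketched as follows. The callables `end_requested`, `detect_target`, and `notify` are hypothetical stand-ins for the end instruction from the input/output device 206, the object detection process of step S503, and the notification unit 303; stopping when the frame source is exhausted is an addition so that the sketch terminates on finite input.

```python
from typing import Callable, Iterable, List, Optional


def run_detection_loop(
    frames: Iterable[object],
    end_requested: Callable[[], bool],
    detect_target: Callable[[object], Optional[str]],
    notify: Callable[[str], None],
) -> None:
    frame_iter = iter(frames)
    while True:
        if end_requested():                 # S501: end instruction?
            break
        frame = next(frame_iter, None)      # S502: acquire an image
        if frame is None:
            break                           # finite test input ran out
        person_id = detect_target(frame)    # S503: object detection process
        if person_id is None:               # S504: nobody detected
            continue                        # back to S501
        notify(person_id)                   # S505: notify, then back to S501


# Usage with stub callables: only the second frame contains a target.
notified: List[str] = []
run_detection_loop(
    frames=["frame-1", "frame-2", "frame-3"],
    end_requested=lambda: False,
    detect_target=lambda f: "P001" if f == "frame-2" else None,
    notify=notified.append,
)
```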

FIG. 6 is a flowchart showing an example of the object detection process executed by the detection unit 302 in step S503 of FIG. 5.
When the object detection process starts, in step S601 the detection unit 401 performs detection processing on the image acquired in step S502 of FIG. 5 and detects a person's face in the image. A known technique may be used to detect a face in an image; for example, the technique described in Reference A below can be used.

Reference A: P. Viola and M. Jones, "Robust real-time face detection", pp. 747, Eighth International Conference on Computer Vision (ICCV'01), Volume 2, 2001.

A face image is cut out of the whole image based on the image coordinates of the face detected by such a face detection technique. At this point, the image is normalized so that the in-plane rotation of the face relative to the image plane is constant. For example, the image may be rotated so that the straight line connecting the two eyes is horizontal in the image. In the present embodiment the entire acquired image is processed, but a range within the image may be specified in advance so that only faces appearing within that range are processed.
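The rotation normalization mentioned above can be illustrated by computing the angle of the line through both eyes and rotating by its negative. This is a minimal sketch: the eye coordinates are assumed to come from a separate landmark detector, the function names are hypothetical, and only points are rotated here, not image pixels.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def eye_line_angle(left_eye: Point, right_eye: Point) -> float:
    """Angle in radians of the line through both eyes, from the image x-axis."""
    return math.atan2(right_eye[1] - left_eye[1], right_eye[0] - left_eye[0])


def rotate_about(point: Point, center: Point, angle: float) -> Point:
    """Rotate point about center by angle using the 2D rotation matrix."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (center[0] + cos_a * dx - sin_a * dy,
            center[1] + sin_a * dx + cos_a * dy)


# Example eye positions in pixel coordinates (illustrative values).
left_eye, right_eye = (30.0, 40.0), (70.0, 60.0)
center = ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)
angle = eye_line_angle(left_eye, right_eye)

# Rotating by the negative angle puts both eyes on the same image row.
aligned_left = rotate_about(left_eye, center, -angle)
aligned_right = rotate_about(right_eye, center, -angle)
```

Applying the same rotation to every pixel of the cropped face would give the normalized face image; an image library would normally handle that resampling.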

Next, in step S602, the feature extraction unit 402 performs feature extraction on the face image detected in step S601 and extracts the feature amount of the face image. A known technique may be used to extract the feature amount; for example, an LBP (Local Binary Pattern) feature can be used. A HOG (Histogram of Oriented Gradients) feature or a SIFT (Scale-Invariant Feature Transform) feature may also be used, as may a combination of these. A feature amount extracted by a CNN (Convolutional Neural Network) trained in advance may also be used. The extracted feature amount may be reduced in dimension using PCA (Principal Component Analysis) or the like. In the present embodiment the feature amount is extracted from the whole face image, but it may instead be extracted from partial regions of the face image. For example, the positions of facial organs such as the eyes and nose may be detected, partial regions may be set based on the detected positions, and a feature amount may be extracted from each partial region.
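As one concrete instance of the LBP feature named above, the following sketch computes a basic 3x3 LBP code for each interior pixel and returns a normalized 256-bin histogram. The text does not specify which LBP variant is used, so the neighbour ordering, the `>=` comparison, and the histogram normalization are assumptions of this sketch.

```python
from typing import List, Sequence

# Offsets (row, column) of the 8 neighbours, clockwise from the top-left.
_NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(gray: Sequence[Sequence[int]]) -> List[float]:
    """Normalized histogram of basic 3x3 LBP codes of a grayscale image."""
    height, width = len(gray), len(gray[0])
    hist = [0] * 256
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            center = gray[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(_NEIGHBOURS):
                # Set the bit when the neighbour is at least as bright.
                if gray[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 256


# A flat patch: every neighbour equals the centre, so the code is 255.
flat_feature = lbp_histogram([[5, 5, 5], [5, 5, 5], [5, 5, 5]])
# A bright centre surrounded by darker pixels: no bit is set, code 0.
peak_feature = lbp_histogram([[0, 0, 0], [0, 9, 0], [0, 0, 0]])
```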

Next, in step S603, the similarity calculation unit 404 acquires the registration dictionary in which information on the detection target persons is registered. As described above, the registration dictionary holds feature amounts of the detection target persons (the persons to be authenticated) registered in advance, and contains feature amounts of a plurality of persons. Here, registration of the information on the detection target persons is assumed to have been completed beforehand. The process of registering this information in the registration dictionary (the dictionary registration process) is described later.

Next, in step S604, the similarity calculation unit 404 calculates the similarity for every combination of a feature amount extracted in the feature extraction process of step S602 and a feature amount registered in the registration dictionary acquired in step S603. A known technique may be used to calculate the similarity; for example, the cosine similarity shown in (Equation 1) below can be used.

$$S = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert} \qquad \text{(Equation 1)}$$

In (Equation 1), x and y are feature vectors, S is the similarity between x and y, · denotes the inner product, and ‖x‖ is the Euclidean norm of x.
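The cosine similarity named above, the inner product of two feature vectors divided by the product of their norms, can be sketched in a few lines. Returning 0.0 for a zero-length vector is an assumption of this sketch, since the text does not cover that case.

```python
import math
from typing import Sequence


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine similarity S between feature vectors x and y."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # assumption: treat a zero vector as dissimilar
    return dot / (norm_x * norm_y)


same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Because the value depends only on the angle between the vectors, scaling a feature vector does not change its similarity to another vector.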

When a plurality of feature amounts are registered in the registration dictionary for one person, the similarity calculation unit 404 calculates the similarity between each of them and the feature amount extracted by the feature extraction process, and adopts the maximum as the similarity for that person. The similarity calculation unit 404 thus calculates as many similarities as there are persons registered in the registration dictionary. Other methods may be used when a plurality of feature amounts are registered for one person. For example, the registered feature amounts may be integrated into a single feature amount before the similarity is calculated, or the plurality of similarities may be calculated and all of them taken into account when the detection result is determined.
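The similarity calculation in step S604 can be sketched as follows, assuming the standard cosine similarity (inner product divided by the product of the Euclidean norms) and the maximum-per-person rule described above. The function names and the dictionary layout (person ID mapped to a list of feature vectors) are hypothetical and chosen only for illustration.

```python
import numpy as np


def cosine_similarity(x, y) -> float:
    """Cosine similarity between two non-zero feature vectors."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))


def per_person_similarity(query, dictionary) -> dict:
    """Return one similarity per registered person.

    `dictionary` maps person ID -> list of registered feature vectors
    (an assumed layout). The maximum over a person's registered vectors
    is adopted as that person's similarity.
    """
    return {
        person_id: max(cosine_similarity(query, feature) for feature in features)
        for person_id, features in dictionary.items()
    }
```

For example, with `{"0001": [[1, 0], [0, 1]], "0002": [[-1, 0]]}` and the query `[1, 0]`, person "0001" receives the similarity of its best-matching registered vector.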

Next, in step S605, the detection result determination unit 405 determines the detection result based on the similarities calculated in step S604 and the detection condition acquired by the detection condition acquisition unit 406. In the present embodiment, the detection condition is a threshold value. For the similarity of each person registered in the registration dictionary, the detection result determination unit 405 determines that the person has been detected if the similarity exceeds the threshold value, and that the subject is not that person if the similarity is less than or equal to the threshold value. That is, with threshold value θ, the detection result determination unit 405 determines as the detection result the person ID whose similarity S satisfies S > θ. When the similarity exceeds the threshold value for a plurality of persons, the detection result is narrowed down to one person by selecting the person ID with the larger threshold value. Other methods may be used for this narrowing down. This concludes the description of the object detection process.
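A minimal sketch of the decision in step S605 follows. It assumes per-person thresholds with a shared default, and it narrows multiple hits down by choosing the highest similarity, which is only one of the "other methods" the text permits and not necessarily the rule used in the embodiment. All names are hypothetical.

```python
from typing import Dict, Optional


def decide_detection(similarities: Dict[str, float],
                     thresholds: Dict[str, float],
                     default_threshold: float) -> Optional[str]:
    """Return the detected person ID, or None if nobody satisfies S > theta."""
    hits = {
        person_id: s
        for person_id, s in similarities.items()
        # Strict inequality: a similarity equal to the threshold is not a detection.
        if s > thresholds.get(person_id, default_threshold)
    }
    if not hits:
        return None
    # Assumed tie-break: keep the candidate with the highest similarity.
    return max(hits, key=hits.get)
```

Keeping thresholds in a per-person mapping makes it straightforward to relax the condition for a single confirmed person later without affecting anyone else.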

(Condition change process)
Next, the process concerning the detection condition that is performed after a detection result has been notified to the user in the detection process described above (condition change process) is described. This condition change process is executed in parallel with the detection process, and changes the detection condition according to the result of the user's visual confirmation of the person detected by the detection process. FIG. 7 is a flowchart showing an example of the condition change process in the first embodiment.

In step S701, the confirmation result acquisition unit 305 determines whether a detection result has been notified to the mobile terminal 105 by the detection process described above. If the confirmation result acquisition unit 305 determines that a detection result has been notified (Yes), the process proceeds to the next step. If it determines that no detection result has been notified (No), it waits until one is notified.

Subsequently, in step S702, the mobile terminal 105 that has received the detection result displays it and prompts the user of the mobile terminal 105 to visually confirm it. Specifically, the detection result is displayed on the mobile terminal 105 as in the example of FIG. 8, and the user is prompted to visually confirm whether the displayed detection result is correct, that is, whether the detected person is the person to be detected. The mobile terminal 105 accepts the user's input of the visual confirmation result and notifies the information processing device 101 of it.

FIG. 8 is a schematic diagram illustrating an example of how the detection result is displayed on the mobile terminal 105. As shown in FIG. 8, the display includes, for example, a detection result 801, a detected registration dictionary 802, a face frame 803 of the detected person, buttons 804 and 805 used to input the result of visually confirming the detection result, and face images 806 and 807 registered in the registration dictionary. The input buttons comprise a <correct> button 804, used when the detection result is judged correct, and an <error> button 805, used when it is judged incorrect. When a plurality of face images are registered in the registration dictionary, the plurality of face images 806 and 807 are displayed on the screen as shown in FIG. 8.

The detection result 801 is an image captured by the camera 103 or the camera 104. Because an image captured by the cameras 103 and 104 may contain a plurality of persons, the face frame 803 is displayed at the position of the detected person's face. This example shows that the person registered with person ID "0001" has been detected, and the registered face images 806 and 807 are displayed side by side in the registration dictionary 802. This display lets the user of the mobile terminal 105 easily judge whether the detection result is correct. The first face image 806 and the second face image 807 were captured at different times and places, so they differ in appearance because the shooting conditions differ. Displaying face images with varied appearances in this way aids visual confirmation. The number of face images displayed on the mobile terminal 105 is not limited to this example. The user compares the detection result 801 displayed on the mobile terminal 105 with the registration dictionary 802 and inputs the result of the visual confirmation by selecting the <correct> button 804 if the detection result is judged correct, or the <error> button 805 if it is judged incorrect. When the system is applied to detecting persons requiring attention, lost children, and the like, and a user such as a security guard ultimately needs to go to the visually confirmed person, a map or the like showing the location of the target person may be displayed on the mobile terminal 105 at the same time.

Returning to FIG. 7, in step S703, the confirmation result acquisition unit 305 acquires from the mobile terminal 105 the result of the visual confirmation of the detection result by the user of the mobile terminal 105.

Next, in step S704, the detection condition changing unit 304 determines whether the visual confirmation result acquired in step S703 is <correct>, indicating that the detection result was judged correct, or <error>, indicating that it was judged incorrect. If the detection condition changing unit 304 determines that the result is <correct> (Yes), the process proceeds to step S705. If it determines that the result is <error> (No), the condition change process for this detection result ends, and the process waits for notification of a new detection result.

In step S705, the detection condition changing unit 304 checks whether there is an instruction to change the detection condition setting. The instruction may be acquired via the input/output device 206 of the information processing device 101, or from the mobile terminal 105. When operating such a system, there may be situations in which the detection condition should not be changed. Therefore, rather than changing the detection condition uniformly whenever visual confirmation finds the detection result correct, the system confirms whether to change it, which improves convenience. If the detection condition changing unit 304 determines that there is an instruction to change the setting (Yes), the process proceeds to step S706. If it determines that there is no such instruction (No), the condition change process for this detection result ends, and the process waits for notification of a new detection result.

In step S706, the detection condition changing unit 304 changes the detection condition for detecting a person. In the present embodiment, the detection condition changing unit 304 changes the threshold value, which is the detection condition used by the detection result determination unit 405, based on (Equation 2) below.
θ' = θ + r ... (Equation 2)
In (Equation 2), θ' is the new threshold value after the change, θ is the threshold value before the change, and r is the change amount.

The change amount r is calculated in advance. In the present embodiment, r is a negative value so that the threshold value is lowered, making it easier to detect again a person whom the user's visual confirmation has confirmed to be a person to be detected. Lowering the threshold value makes it easier to detect the person again but may also increase false detections, so the magnitude of r should be determined in advance, using a specific data set or the like, such that the false detection rate remains acceptable. The method of changing the detection condition is not limited to this. For example, the detection condition may be changed by multiplying the threshold value before the change by a predetermined coefficient, or by a calculation based on the number of registered persons.

Here, the changed detection condition is applied not to the detection of all persons registered in the registration dictionary, but only to the detection of the person confirmed by visual confirmation to be a person to be detected. This not only makes that person easier to detect again but, because the detection condition (threshold value) is relaxed for that person alone, also reduces false detections compared with relaxing the detection condition uniformly for all persons.
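The per-person update of (Equation 2) can be sketched as follows. The function name, the mapping of person ID to threshold, and the default value of `r` are assumptions for illustration; the text only states that r is negative and determined in advance.

```python
from typing import Dict


def relax_threshold(thresholds: Dict[str, float],
                    person_id: str,
                    default_threshold: float,
                    r: float = -0.05) -> Dict[str, float]:
    """Apply theta' = theta + r to one confirmed person only.

    `r` must be negative so that the threshold is lowered; the value
    -0.05 is a placeholder, not a value given in the source.
    """
    if r >= 0:
        raise ValueError("r must be negative to relax the detection condition")
    updated = dict(thresholds)  # leave the caller's mapping untouched
    theta = thresholds.get(person_id, default_threshold)
    updated[person_id] = theta + r
    return updated
```

Only the confirmed person's entry changes; every other person continues to be judged against their existing threshold or the default.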

Once the detection condition has been changed as described above, subsequent detection processing uses the new detection condition. That is, the detection system can more easily detect a person whom the user has visually confirmed to be a person to be detected, because the similarity threshold used for the determination in S605 of FIG. 6 is lowered for that person. In the present embodiment, the images of all cameras in the detection system are processed by the information processing device 101, so only the detection condition of the information processing device 101 needs to be changed. When the detection process is performed inside each camera, the changed detection condition may be transferred to each camera. This concludes the description of the condition change process.

In the detection system of the present embodiment, the person detected by the detection process from the images captured by the cameras 103 and 104 is visually confirmed, and the user can then go straight to the detected person. Even if the user cannot get there immediately because of crowding or the like and loses sight of the person, the detection condition has been changed to make that person easier to detect, which raises the likelihood of detecting the person again in another captured image. For example, even if the user loses sight of a person who was detected in an image captured by the camera 103 and visually confirmed, the person is more likely to be detected again in an image captured by the camera 104. Likewise, if the person leaves the shooting range of the camera 103 and later returns to it, the person is more likely to be detected again because the detection condition has been changed.

(Dictionary registration process)
Next, the process of registering a specific person to be detected (dictionary registration process) is described. The example below describes registering information on a person to be detected based on an image captured by a camera separate from the configuration described above. The camera used to register images in the registration dictionary is not limited to this. The dictionary registration process runs on request, in parallel with the detection process described above. FIG. 9 is a flowchart showing an example of the dictionary registration process in the first embodiment.

In the dictionary registration process, first, in step S901, the image acquisition unit 301 determines whether there is an instruction to end the dictionary registration process. This instruction is acquired from the input/output device 206. If the image acquisition unit 301 determines that there is an end instruction (Yes), the dictionary registration process shown in FIG. 9 ends. If it determines that there is no end instruction (No), then in step S902 the image acquisition unit 301 acquires an image to be registered. As mentioned above, this is an image captured in advance by a separate camera.

Subsequently, in step S903, the detection unit 401 of the detection unit 302 performs detection processing on the image acquired in step S902 and detects a person's face in the image. The process of detecting a person's face in an image has already been described, so its description is omitted here. Next, in step S904, the feature extraction unit 402 of the detection unit 302 performs feature extraction on the face image detected in step S903 and extracts a feature amount from it. The process of extracting a feature amount from a face image has likewise already been described and is omitted here.

Next, in step S905, the dictionary registration unit 403 of the detection unit 302 acquires, via the input/output device 206, the person ID corresponding to the feature amount extracted in step S904. Next, in step S906, the dictionary registration unit 403 associates the extracted feature amount with the corresponding face image and person ID and stores them in the registration dictionary. After the information on the person to be detected has been registered in step S906, the process returns to step S901 and the processing continues until an instruction to end the dictionary registration process is given. This concludes the description of the dictionary registration process. Persons registered by this process are detected by the detection process described above.
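The association stored in step S906 can be pictured with a small in-memory structure. This is a sketch under assumptions: the class name `RegistrationDictionary`, its methods, and the in-memory storage are hypothetical, since the source does not describe how the registration dictionary is implemented.

```python
from collections import defaultdict

import numpy as np


class RegistrationDictionary:
    """Maps a person ID to its registered (feature, face image) pairs."""

    def __init__(self):
        self._entries = defaultdict(list)

    def register(self, person_id, feature, face_image=None):
        # Several entries per person are allowed, e.g. different shooting conditions.
        self._entries[person_id].append(
            (np.asarray(feature, dtype=np.float64), face_image)
        )

    def features(self, person_id):
        """Return the registered feature vectors for one person (empty if unknown)."""
        return [feature for feature, _ in self._entries.get(person_id, [])]

    def person_ids(self):
        return list(self._entries)
```

Registering two images of the same person under ID "0001" yields two feature vectors for that ID, matching the case in which several face images are shown to the user for comparison.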

According to the first embodiment, changing the detection condition using the result of the user's visual confirmation of the detection result allows human visual judgment to be reflected in the system, which improves convenience. In addition, because the detection condition is changed so that the visually confirmed person is easier to detect again, the person is more likely to be detected again by a camera at another location. This raises the likelihood of detecting again a person to be detected who has been lost from sight, further improving convenience.

In the present embodiment, the detection result is visually confirmed using only the mobile terminal 105, but other means may be used. For example, the mobile terminal may display only the face images of the registration dictionary, and the user of the mobile terminal may go to the detected person, confirm by direct visual observation, and input the result. In crowded conditions, however, direct visual confirmation is difficult, so in such cases the method described above is preferable.

In the present embodiment, the changed detection condition is used for detection in the images acquired from all cameras, but the change may instead be limited to images acquired from specific cameras. For example, the changed detection condition is applied only to images acquired from cameras close to the place where the person was detected. In this way, the detection condition setting is changed only for cameras at places where the detected person is likely to appear next, which prevents an increase in false detections at cameras where the person is unlikely to appear.
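One way to select the cameras that receive the changed detection condition is a simple distance cut-off, sketched below. The 2-D camera coordinates, the straight-line distance measure, and the `max_distance` parameter are assumptions for illustration; the source only says that cameras close to the place of detection are chosen.

```python
import math
from typing import Dict, Set, Tuple


def cameras_to_relax(detected_camera: str,
                     camera_positions: Dict[str, Tuple[float, float]],
                     max_distance: float) -> Set[str]:
    """Return IDs of cameras within `max_distance` of the detecting camera.

    The detecting camera itself is included because its distance is zero.
    """
    origin_x, origin_y = camera_positions[detected_camera]
    return {
        camera_id
        for camera_id, (x, y) in camera_positions.items()
        if math.hypot(x - origin_x, y - origin_y) <= max_distance
    }
```

In a real installation, walking distance along corridors may be a better measure than straight-line distance; that choice is outside what the source specifies.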

In the present embodiment, as shown in FIG. 8, the detection result is displayed together with images of the person to be detected, but the display method is not limited to this. In particular, false detections are expected to increase when the detection condition has been changed, so the display method may be switched accordingly. For example, FIG. 8 displays detection results one person at a time, but when a plurality of candidates for the person to be detected are detected, the candidate images may be displayed side by side as tiles. This makes it easier for the user to visually confirm which candidate is actually the person being searched for.

<Second embodiment>
Next, a second embodiment of the present invention is described. The first embodiment described an example in which the user visually confirms whether the detection result of the detection process by the information processing device is correct or incorrect. In practice, however, the user's visual confirmation cannot always classify a detection result clearly as correct or incorrect, so it is convenient to be able to indicate the correctness of the detection result as a degree.

The second embodiment describes an example in which the result of visually confirming the detection result is expressed as a degree of correctness, that is, as one of multiple values (continuous or multi-level) rather than a binary value, and the detection condition is changed according to that result. The second embodiment is the same as the first embodiment except for the condition change process, so only the condition change process is described below. The basic flow of the condition change process in the second embodiment is the same as that of the first embodiment shown in FIG. 7, so the specific processing is described with reference to FIG. 7 as appropriate.

In the condition change process of the second embodiment, in step S701, the confirmation result acquisition unit 305 determines whether the detection result of the detection process has been notified to the mobile terminal 105. If it determines that no detection result has been notified (No), it waits until one is notified. If it determines that a detection result has been notified (Yes), then in step S702 the mobile terminal 105 that has received the detection result displays it and prompts the user of the mobile terminal 105 to visually confirm it. In the second embodiment, the user is prompted to visually confirm the degree of correctness of the displayed detection result, that is, how certain it is that the detected person is the person to be detected. The mobile terminal 105 accepts the user's input of the visual confirmation result and notifies the information processing device 101 of it.

FIG. 10 is a schematic diagram illustrating an example of how the detection result is displayed on the mobile terminal 105 in step S702 of the condition change process in the second embodiment. As shown in FIG. 10, the display includes, for example, a detection result 1001, a registration dictionary 1002, a face frame 1003 of the detected person, a slide bar 1005, face images 1006 and 1007 registered in the registration dictionary, and a visual confirmation result transmission button 1008. The slide bar 1005 is used to input the degree of correctness of the detection result as the result of the visual confirmation.

In the first embodiment, one of the two values <correct> and <error> was selected as the visual confirmation result, whereas in the second embodiment the slide bar 1005 allows the degree of correctness of the detection result to be expressed. When the user of the mobile terminal 105 selects a degree with the slide bar 1005 and then selects the visual confirmation result transmission button 1008, the mobile terminal 105 transmits the visual confirmation result to the information processing device 101.

Then, in step S703, the confirmation result acquisition unit 305 acquires the visual confirmation result for the detection result from the mobile terminal 105. In the present embodiment, the degree w indicating the correctness of the detection result takes a real value; for example, the left end (incorrect) of the slide bar 1005 has the value 0.0 and the right end (correct) has the value 1.0. In this way, even when the user performing the visual confirmation cannot judge with confidence, for instance because of how the detection result appears, the uncertain judgment can still be expressed.

In the next step S704 of the condition change process in the second embodiment, the detection condition changing unit 304 determines whether the degree of correctness acquired as the visual confirmation result is greater than a predetermined threshold value. If it determines that the degree is greater than the predetermined threshold value (Yes), it treats the visual confirmation result as correct and proceeds to step S705. If it determines that the degree is less than or equal to the predetermined threshold value (No), it treats the visual confirmation result as incorrect, ends the condition change process for this detection result, and waits for notification of a new detection result.

In step S705, as in the first embodiment, the detection condition changing unit 304 checks whether there is an instruction to change the detection condition setting. If it determines that there is such an instruction (Yes), the process proceeds to step S706. If it determines that there is no such instruction (No), the condition change process for this detection result ends, and the process waits for notification of a new detection result.

In step S706, the detection condition changing unit 304 changes the detection condition used for detecting a person. In the second embodiment, the detection condition changing unit 304 changes the threshold, which is the detection condition used by the detection result determination unit 405, according to (Equation 3) below.
θ' = θ + wr ... (Equation 3)
In (Equation 3), θ' is the new threshold after the change, θ is the threshold before the change, w is the degree of correctness, and r is the change amount. In this embodiment, as described above, the degree of correctness w takes a value from 0.0 to 1.0, and the predetermined change amount r multiplied by w is used as the actual change in the threshold. Thus, when the user confidently judges the detection result to be correct, w approaches 1.0 and the threshold changes by a large amount; conversely, when the judgment is unconfident, w approaches 0.0 and the threshold changes only slightly. As a result, unconfident visual confirmation results have almost no effect on detection accuracy, and changes to the detection condition based on erroneous visual confirmation results are suppressed, which increases the reliability of the detection system.
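As a minimal sketch only, the check in step S704 and the update in (Equation 3) might be written as follows. The function name `update_threshold`, the parameter name `gate` (standing in for the predetermined threshold of step S704), and its default value are hypothetical and do not appear in the specification; the setting-change check of step S705 is omitted. The sign of r is also an assumption: a negative r is used in the usage note so that the threshold is lowered.

```python
from typing import Optional


def update_threshold(theta: float, w: float, r: float,
                     gate: float = 0.5) -> Optional[float]:
    """Return theta' = theta + w * r, or None when w fails the S704 check.

    theta: detection threshold before the change
    w:     degree of correctness from the visual confirmation (0.0 to 1.0)
    r:     predetermined change amount (sign chosen by the caller)
    gate:  hypothetical stand-in for the predetermined threshold of step S704
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("degree of correctness w must be in [0.0, 1.0]")
    # Step S704: continue only when w is strictly greater than the gate value.
    if w <= gate:
        return None
    # Step S706: (Equation 3), the change amount r is scaled by w.
    return theta + w * r
```

With a hypothetical r of -0.1 and a current threshold of 0.8, a fully confident confirmation (w = 1.0) lowers the threshold by the full 0.1, while w = 0.6 lowers it by only 0.06; a confirmation with w at or below the gate leaves the threshold unchanged.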

The method of changing the detection condition is not limited to this. There may be use cases in which a person who is possibly the target should be detected again in any case, even when the user is not confident in the result of the visual confirmation. In such cases, contrary to this embodiment, the threshold may be corrected by a larger amount when the user is not confident in the visual confirmation result, making re-detection easier.

According to the second embodiment, a more flexible system can be realized by expressing the result of the visual confirmation as a degree of correctness rather than as a binary value of correct or incorrect.

<Other Embodiments>
Various methods other than those described above are conceivable for displaying the images used for visual confirmation. The upper part of FIG. 11 shows an image sequence (only the face images, enlarged) captured by the cameras 103 and 104. Frames 1101 to 1104 are shown in the order in which they were captured. In the embodiments described above, only one of these frames is displayed on the mobile terminal 105, but all frames of the detected (tracked) image sequence may be displayed instead. Displaying multiple frames in this way lets the user perform the visual confirmation based on preceding and following images with different appearances, which improves convenience.

Various methods other than those described above are also conceivable for changing the detection condition. In the embodiments described above, the threshold is changed by a predetermined constant, but it may instead be changed dynamically. In the lower part of FIG. 11, the solid line 1105 shows the transition of the similarity for a certain person, corresponding to the surveillance camera image sequence in the upper part. The similarity exceeds the threshold through the third frame from the left and falls below the threshold in the last frame. If the visual confirmation determines that the detection result for this person is correct, the threshold is changed based on this information. For example, setting the threshold to the lowest similarity in the sequence (the last frame) increases the likelihood that this person can be detected again. Other conceivable methods include multiplying the mean, median, or maximum of the similarities of all frames by a predetermined ratio. Whichever method is used, there is a trade-off between the benefit of a higher likelihood of re-detection and the drawback of more false detections, so the method should be selected according to the use case.
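The dynamic alternatives listed above can be sketched as follows, purely for illustration. The function name `dynamic_threshold`, the `method` strings, and the sample similarity values are hypothetical. Applying the ratio uniformly to every statistic (with a default of 1.0, which leaves the statistic unchanged) is an assumption made to keep the sketch short.

```python
import statistics
from typing import Sequence


def dynamic_threshold(similarities: Sequence[float],
                      method: str = "min",
                      ratio: float = 1.0) -> float:
    """Derive a new threshold from the per-frame similarities of a
    visually confirmed person (hypothetical helper, not from the source)."""
    if not similarities:
        raise ValueError("at least one similarity is required")
    if method == "min":
        base = min(similarities)                 # lowest similarity in the sequence
    elif method == "mean":
        base = statistics.mean(similarities)     # average over all frames
    elif method == "median":
        base = statistics.median(similarities)   # median over all frames
    elif method == "max":
        base = max(similarities)                 # highest similarity in the sequence
    else:
        raise ValueError(f"unknown method: {method}")
    # Scale the chosen statistic by the predetermined ratio.
    return base * ratio
```

For a hypothetical sequence such as `[0.82, 0.78, 0.75, 0.60]`, `method="min"` yields the most permissive threshold and therefore the highest chance of re-detection, at the cost of more false detections, which is the trade-off noted above.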

The registration dictionary may also be treated as a detection condition and changed. That is, in the first embodiment the detection condition changing unit 304 and the dictionary registration unit 403 are separate components, but they may be combined so that the detection condition changing unit has the function of the dictionary registration unit. If the camera image captured when the person to be detected was detected is additionally registered in the registration dictionary, the likelihood that other cameras can detect that person increases. For example, when the user views frame 1102 of FIG. 11 for visual confirmation and determines that the detection result is correct, the image of frame 1102 is additionally registered in that person's registration dictionary. Alternatively, all images of frames 1101 to 1104 may be additionally registered, or images may be selected for registration based on temporal information, that is, the time at which each image was captured. For example, only the images consecutive with the visually confirmed image (frames 1101 and 1102) may be additionally registered.
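One possible reading of the time-based selection is sketched below. The specification does not define what makes frames "consecutive", so the rule used here (adjacent frames no more than `max_gap` seconds apart) is an assumption, as are the `Frame` type, the function name, and the timestamps in the usage note.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    frame_id: int
    captured_at: float  # capture time in seconds (hypothetical unit)


def select_consecutive_frames(frames: List[Frame], confirmed_id: int,
                              max_gap: float) -> List[Frame]:
    """Return the run of frames around the visually confirmed frame in which
    neighbouring frames are at most max_gap seconds apart."""
    ordered = sorted(frames, key=lambda f: f.captured_at)
    ids = [f.frame_id for f in ordered]
    if confirmed_id not in ids:
        raise ValueError("confirmed frame is not in the sequence")
    start = end = ids.index(confirmed_id)
    # Extend towards earlier frames while the time gap stays small.
    while start > 0 and (ordered[start].captured_at
                         - ordered[start - 1].captured_at) <= max_gap:
        start -= 1
    # Extend towards later frames while the time gap stays small.
    while end < len(ordered) - 1 and (ordered[end + 1].captured_at
                                      - ordered[end].captured_at) <= max_gap:
        end += 1
    return ordered[start:end + 1]
```

With hypothetical capture times of 0.0, 0.5, 3.0, and 3.5 seconds for frames 1101 to 1104, confirming frame 1102 with `max_gap=1.0` selects frames 1101 and 1102 for additional registration; a very large `max_gap` selects all four frames.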

The parameters used for calculating the similarity may also be changed. For the cosine similarity given as an example in this embodiment, if the origin used when calculating the angle is computed from images captured by the camera, comparison can be performed in the feature space of the actual capture site, enabling more accurate detection. However, if the similarity calculation method differs from camera to camera, the similarities cannot be compared directly, so normalization parameters or the like for comparison must be calculated in advance. Also, although persons are detected based on face images in this embodiment, only persons who have been visually confirmed may be detected based on other information. For example, detecting a visually confirmed person using attribute information such as clothing and gender increases false detections but further increases the likelihood of re-detection.
In the above description, an example of detecting a person as the detection target has been described, but the subject to be detected is not limited to a person. The subject to be detected may be, for example, a car or a specific object.
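The origin shift for cosine similarity mentioned above can be illustrated with a short sketch. The specification says only that the origin is calculated from images captured by the camera; taking the mean of feature vectors extracted at the capture site is an assumption made here, and the function names and sample vectors are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def mean_vector(features: Sequence[Sequence[float]]) -> List[float]:
    """Mean of feature vectors from one camera, used as an assumed origin."""
    if not features:
        raise ValueError("at least one feature vector is required")
    n = len(features)
    dim = len(features[0])
    return [sum(f[i] for f in features) / n for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float],
                      origin: Optional[Sequence[float]] = None) -> float:
    """Cosine of the angle between a and b, measured from the given origin.

    With origin=None the usual zero origin is used.
    """
    if origin is None:
        origin = [0.0] * len(a)
    # Re-express both feature vectors relative to the chosen origin.
    u = [x - o for x, o in zip(a, origin)]
    v = [y - o for y, o in zip(b, origin)]
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("a vector coincides with the origin")
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (norm_u * norm_v)
```

Because the resulting values depend on the origin chosen for each camera, similarities computed with different origins are not directly comparable, which is why the text calls for normalization parameters when the calculation differs per camera.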

Among the processing units described above, artificial intelligence (AI) may be applied to the feature extraction unit 402 and the like. For example, a machine-learned trained model may be used in place of such a processing unit. In that case, multiple combinations of input data and output data for the processing unit are prepared as training data, knowledge is acquired from them by machine learning, and a trained model is generated that outputs, as a result, output data for given input data based on the acquired knowledge. This trained model can be configured as, for example, a neural network model. The trained model then performs the processing of the processing unit by operating in cooperation with a CPU, GPU, or the like as a program for performing processing equivalent to that of the processing unit. The trained model may also be updated as necessary, for example each time a certain amount of data has been processed.

The present invention can also be realized by processing in which a program that implements one or more functions of the embodiments described above is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

The embodiments described above are merely examples of how the present invention may be embodied, and the technical scope of the present invention must not be construed restrictively on their basis. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features.

101: Information processing apparatus
102: Network
103, 104: Surveillance camera
105: Mobile terminal
301: Image acquisition unit
302: Detection unit
303: Notification unit
304: Detection condition changing unit
305: Confirmation result acquisition unit
401: Object detection unit
402: Feature extraction unit
403: Dictionary registration unit
404: Similarity calculation unit
405: Detection result determination unit
406: Detection condition acquisition unit

Claims

An information processing apparatus comprising:
image acquisition means for acquiring an image;
detection means for detecting a subject to be detected from the image;
confirmation result acquisition means for acquiring a result of visual confirmation by a user of a detection result by the detection means; and
detection condition changing means for changing, based on the result of the visual confirmation acquired by the confirmation result acquisition means, a detection condition used when the detection means detects the subject to be detected.

The information processing apparatus according to claim 1, wherein, when the result of the visual confirmation acquired by the confirmation result acquisition means indicates that the detection result by the detection means is correct, the detection condition changing means changes the detection condition so that the subject to be detected becomes easier to detect.

The information processing apparatus according to claim 1 or 2, wherein the detection condition changing means changes only the detection condition of the visually confirmed subject among the registered subjects to be detected.

The information processing apparatus according to any one of claims 1 to 3, wherein the detection condition changing means changes the detection condition in response to an instruction to change the setting of the detection condition.

The information processing apparatus according to any one of claims 1 to 4, wherein the detection means includes:
dictionary registration means for registering the subject to be detected in a registration dictionary;
object detection means for detecting a predetermined object from the image;
similarity calculation means for calculating a similarity between the registration dictionary and the object detected by the object detection means; and
detection result determination means for determining the detection result based on the similarity calculated by the similarity calculation means and a threshold,
and wherein the detection condition changing means changes the threshold based on the result of the visual confirmation.

The information processing apparatus according to claim 5, wherein the detection means includes feature extraction means for extracting a feature amount of the object, and
the similarity calculation means calculates the similarity between the feature amount registered in the registration dictionary and the feature amount of the object detected by the object detection means.

The information processing apparatus according to any one of claims 1 to 6, wherein the result of the visual confirmation acquired by the confirmation result acquisition means is information indicating whether the detection result by the detection means is correct.

The information processing apparatus according to any one of claims 1 to 6, wherein the result of the visual confirmation acquired by the confirmation result acquisition means is information indicating the degree of correctness of the detection result by the detection means.

The information processing apparatus according to any one of claims 1 to 8, further comprising notification means for notifying the detection result by the detection means.

The information processing apparatus according to claim 9, wherein the notification means displays an image in which the detection means detected the subject to be detected, together with the detection result by the detection means.

The information processing apparatus according to claim 9, wherein, when the detection condition changing means changes the detection condition, the notification means displays a plurality of images in which the detection means detected the subject to be detected, together with the detection result by the detection means.

The information processing apparatus according to any one of claims 1 to 11, further comprising display means for displaying the detection result by the detection means.

The information processing apparatus according to any one of claims 1 to 12, wherein the subject to be detected is a specific person.

An information processing system comprising:
one or more cameras; and
the information processing apparatus according to any one of claims 1 to 13, wherein the image acquisition means acquires the image captured by the camera.

The information processing system according to claim 14, further comprising a terminal that receives the detection result by the detection means from the information processing apparatus, displays it, and transmits the result of visual confirmation by the user to the information processing apparatus.

The information processing apparatus according to claim 5, wherein, when the result of the visual confirmation acquired by the confirmation result acquisition means indicates that the detection result by the detection means is correct, an image of the subject of the detection result is registered in the registration dictionary.

An information processing method comprising:
an image acquisition step of acquiring an image;
a detection step of detecting a subject to be detected from the image;
a confirmation result acquisition step of acquiring a result of visual confirmation by a user of a detection result in the detection step; and
a detection condition changing step of changing, based on the result of the visual confirmation acquired in the confirmation result acquisition step, a detection condition used when the subject to be detected is detected in the detection step.

A program for causing a computer to execute:
an image acquisition step of acquiring an image;
a detection step of detecting a subject to be detected from the image;
a confirmation result acquisition step of acquiring a result of visual confirmation by a user of a detection result in the detection step; and
a detection condition changing step of changing, based on the result of the visual confirmation acquired in the confirmation result acquisition step, a detection condition used when the subject to be detected is detected in the detection step.