JP5982557B2

JP5982557B2 - Video surveillance system and image search system

Info

Publication number: JP5982557B2
Application number: JP2015507829A
Authority: JP
Inventors: 智明吉永; 健一米司; 敦廣池; 裕樹渡邉; 佑人小松; 昌広影山
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2013-03-29
Filing date: 2013-03-29
Publication date: 2016-08-31
Anticipated expiration: 2033-03-29
Also published as: JPWO2014155639A1; WO2014155639A1

Description

The present invention relates to a video surveillance system, and more particularly to a technique for performing detection processing, such as face and person detection, on video acquired by a camera.

In conventional large-scale surveillance systems, specific objects such as faces, people, and vehicles are detected in the video of surveillance cameras installed at various locations, and features extracted from the detected objects are registered and managed in a database. This made it possible to search for a specific person using the accumulated object features and to analyze the movement trajectory of a specific person instantly.

However, conventional systems have the problem that recognition accuracy degrades depending on the installation conditions, because the lighting environment, depression angle, and other conditions differ greatly among the surveillance cameras on the network connected to the system. For example, face detection accuracy may degrade for video from a camera installed at a large depression angle or at a position where the surrounding lighting fluctuates severely. In addition, when the face orientation of the people captured differs between cameras, a face search over camera 2, whose depression angle differs from that of camera 1, based on the features of a face captured by camera 1 may fail to find the face of the same person because the face orientations differ.

To address this, Patent Document 1 proposes a method of stabilizing against lighting variation by selecting an image processing method, based on a table prepared in advance, according to the temperature of the operating environment. Similarly, Patent Document 2 aims at stabilization by photographing a face with two cameras and switching the face authentication threshold according to the state of the face obtained from each image.

Patent Document 1: JP 2012-134875 A
Patent Document 2: JP 2008-108243 A

Non-Patent Document 1: P. Viola and M. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features", CVPR 2001

However, Patent Document 1 has the problem that a quantity such as temperature must be measured by a separate means, which complicates the system. In Patent Document 2, two cameras must be installed at the same position, which increases cost. Setting a different recognition process for each camera by hand is also conceivable, but this requires expert knowledge, and when the number of cameras becomes very large, such as several hundred or more, the setting takes a great deal of time.

To solve the above problems, the configurations described in the claims are adopted, for example. The present application includes a plurality of means for solving the above problems. One example is a monitoring system comprising: a camera; a detection unit that detects an object in an input image captured by the camera; a feature extraction unit that extracts features of the detected object; a storage unit that accumulates the input image, the object, and the features; a selection control unit that controls the detection unit and the feature extraction unit; and an evaluation unit that evaluates the results from the detection unit and the feature extraction unit, wherein the selection control unit selects the detection method used by the detection unit and the extraction method used by the feature extraction unit based on the output of the evaluation unit.

According to the present invention, the optimum detection and extraction processing can be selected automatically for each camera, even when conditions differ from one surveillance camera to another.

A diagram showing an outline of Embodiment 1 of the present invention.
A diagram showing the system configuration of the present invention.
A flowchart showing the flow of detection processing in the object detection unit.
A diagram showing an outline of the detection methods in the object detection unit.
A flowchart showing the flow of feature extraction processing in the feature extraction unit.
A diagram showing an example of the recognition result output by the recognition result output unit.
A diagram showing the configuration of the evaluation unit.
A flowchart showing the flow of evaluation image collection performed by the image collection unit.
A diagram showing an outline of recognition result determination in the result determination unit.
A diagram showing an example of a statistical analysis result in the statistical analysis unit.
A diagram showing an example of a method selection result in the selection control unit.
A diagram showing an example of the management screen for the recognition unit on the display unit.
A diagram showing an example of a screen that displays the result of an image search on the display unit.
A diagram showing an example of the result of statistical analysis, in the statistical analysis unit, of the position and size of detected objects in the image.
A diagram showing an example of the special-condition method control table in Embodiment 3 of the present invention.

FIG. 1 shows an outline of the present invention and is a diagram showing an example of a monitoring system using the present invention. The system has one or more cameras 110a to 110b, which are connected through a communication infrastructure 120. The communication infrastructure 120 is a LAN or a video transmission cable, and the video of each camera is transmitted through it. The image recognition unit 130 is a processing unit located on a server or on a CPU inside a camera. It performs recognition processing such as face detection on the images obtained from the cameras 110 and transmits the recognition results to the feature database 140 through the communication infrastructure 120.

The feature database 140 has a function of storing and managing the features obtained from the image recognition unit 130. It also has a function of searching the accumulated features for data with similar features. The image features stored in the feature database 140 are linked, using IDs or the like, to the images stored in the image database 150. The image database 150 stores the image data obtained from each camera 110.

Time information, camera information, and the like are attached to the image data stored here, and the image database has a function of instantly retrieving a given image based on this information. The management unit 160 has functions for managing the image recognition unit 130, such as overseeing the control methods set in the selection control unit. The display unit 170 is used to search and browse the image data and other data stored in the image database and to manage the state of the image recognition unit 130 through the management unit 160. The sensors 180 are various sensors such as infrared sensors and RFID readers, and the image recognition unit 130 can also acquire sensor information through the communication infrastructure 120. The sensors 180 are not, however, an essential component.

In the image recognition unit 130, the selection control unit 131 first determines, for the image obtained from a camera 110, the image recognition methods to be used by the object detection unit 132 and the feature extraction unit 133. The object detection unit 132 has a plurality of detection methods 1 to N and performs object detection according to the method selected by the selection control unit 131. Similarly, the feature extraction unit 133 extracts the features of each object obtained by the object detection unit 132 using the extraction method determined by the selection control unit 131.

The recognition result output unit 134 sends the obtained object feature information to the feature database 140 via the communication infrastructure 120. Depending on the situation, it also sends the information to the evaluation unit 135. The evaluation unit 135 evaluates the various methods available in the object detection unit 132 and the feature extraction unit 133 and sends the evaluation results to the statistical analysis unit 136. The statistical analysis unit 136 statistically analyzes the accumulated evaluation results to determine which object detection method and feature extraction method are optimal for each camera, or for each time period of each camera, and creates and periodically updates a method selection table. The selection control unit 131 selects methods according to the method selection table held by the statistical analysis unit 136. With this configuration, object detection and feature extraction can be performed using the optimum recognition method determined in advance for each camera, so that recognition processing suited to the installation conditions is carried out automatically.

FIG. 2 is a diagram showing the system configuration of the present invention. The image recognition unit 130 is software that runs on the server computer 210. The server computer 210 consists of an I/F 211 that sends and receives information to and from the communication infrastructure, a CPU 212 that performs the processing of the image recognition unit 130, a memory 213, and an HDD 214 that stores information. The management unit 160 likewise runs on the CPU 212, and its management information is stored in the HDD 214. The display unit 170 is implemented in a client computer. The present invention is executed on this system configuration. There may be one server computer or several. The server computer may also be implemented on the same device as the client computer 170.

FIG. 3 is a diagram showing an example of the processing flow of the object detection unit 132. The example shown here detects faces as the objects, but person or vehicle detection may be performed instead, depending on the target to be detected. The image obtained in S31 is preprocessed in S32. Candidate preprocessing steps include image rescaling, super-resolution, and filtering such as smoothing and edge enhancement. Next, S33 performs moving object detection and extracts the moving object regions. S34 performs face detection on the moving object regions obtained in S33. Faces can be detected in the image with a classifier built in advance by machine learning, for example using the method described in Non-Patent Document 1.

FIG. 4 shows examples of such classifiers. A classifier is built in advance for each face orientation: a frontal face classifier that can detect frontal faces, a profile classifier that can detect left profiles, a downward face classifier that can detect downward-looking faces, and so on. Classifiers may also be prepared for each face rotation, or for each kind of object such as people and vehicles. At run time, detection is performed using the classifier that the selection control unit 131 has selected from among these. S35 performs false detection determination on the detection results obtained in S34, and S36 outputs the detection results. Whether each step is executed or skipped, and which of several methods is used, is decided according to the result determined by the selection control unit 131.
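The per-step switching described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `DetectionConfig` layout, the function names, and the idea of passing each step in as a callable are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


@dataclass
class DetectionConfig:
    """Choices made by the selection control unit (hypothetical layout)."""
    classifier: str                      # e.g. "frontal", "profile", "downward"
    use_motion_detection: bool = True    # run or skip S33
    use_false_positive_check: bool = True  # run or skip S35


def detect_objects(
    image,
    config: DetectionConfig,
    classifiers: Dict[str, Callable[[object, Optional[Box]], List[Box]]],
    motion_detector: Callable[[object], List[Optional[Box]]],
    false_positive_check: Callable[[object, Box], bool],
) -> List[Box]:
    # S33: restrict the search to moving regions, or use the whole image
    # (represented here by a single None region) when the step is skipped.
    regions = motion_detector(image) if config.use_motion_detection else [None]

    # S34: run only the classifier that was selected for this camera.
    detect = classifiers[config.classifier]
    boxes: List[Box] = []
    for region in regions:
        boxes.extend(detect(image, region))

    # S35: optionally discard detections judged to be false positives.
    if config.use_false_positive_check:
        boxes = [box for box in boxes if false_positive_check(image, box)]

    # S36: output the detection results.
    return boxes
```

Keeping the configuration separate from the pipeline means a different camera only needs a different `DetectionConfig`, not different code.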

FIG. 5 is a diagram showing an example of the flow of object feature extraction in the feature extraction unit. S51 receives the object detection results obtained by the object detection unit 132. S52 selects an object o from among all detected objects, and the processing of S53 and S54 below is applied to it. S53 preprocesses the detected face image. The preprocessing includes lighting normalization, image enlargement by super-resolution, and image position correction based on detecting the positions of facial parts such as the eyes, nose, and mouth.

Next, S54 extracts features from the preprocessed face image. Feature extraction produces a D-dimensional image feature using PCA (principal component analysis), Gabor filters, HOG feature computation, or the like. Several extraction methods are prepared, such as a method that extracts a D1-dimensional feature from the entire head, a method that extracts a D2-dimensional feature only from the face region excluding the hair, and a method that extracts a D3-dimensional local feature from around the facial part positions obtained in the preprocessing of S53, and these are switched as specified by the selection control unit 131. Here, D1, D2, D3 < D.

S55 determines whether features have been extracted for all detected objects. If any object remains unprocessed, the flow returns to S52; otherwise it moves on to S56. S56 outputs the features of all objects. In this way, for example, when faces tend to appear small in a camera's view, the D1-dimensional whole-head feature is extracted, and when they appear large, both the D1-dimensional whole-head feature and the D2-dimensional face-region feature are extracted. Extracting multiple features in this way makes it possible to search using only the D1-dimensional whole-head feature between a camera in which faces appear small and a camera in which they appear large, and to search using all features between cameras in which faces appear large.
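The cross-camera search described above amounts to comparing only the feature parts that both records contain. The sketch below assumes features are stored as a dictionary from a part name (the hypothetical keys `"head"` and `"face"`) to a vector, and uses Euclidean distance, which the text does not specify.

```python
import math
from typing import Dict, List

Features = Dict[str, List[float]]  # part name -> feature vector (assumed layout)


def shared_part_distance(a: Features, b: Features) -> float:
    """Euclidean distance over the feature parts present in both records."""
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("the two records have no feature part in common")
    squared = 0.0
    for part in shared:
        if len(a[part]) != len(b[part]):
            raise ValueError(f"dimension mismatch for part {part!r}")
        squared += sum((x - y) ** 2 for x, y in zip(a[part], b[part]))
    return math.sqrt(squared)


# A camera where faces appear small stores only the whole-head feature;
# a camera where faces appear large stores head and face features.
small_view = {"head": [0.0, 0.0]}
large_view = {"head": [3.0, 4.0], "face": [1.0, 1.0]}
print(shared_part_distance(small_view, large_view))  # compares "head" only
```

Between two large-view records the same function automatically uses both parts, which corresponds to searching with all features.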

FIG. 6 is a diagram showing an example of the output of the recognition result output unit 134. The recognition result output unit 134 adds the camera number and time information to the object information obtained by the object detection unit 132 and the feature extraction unit 133 to form a recognition result, and registers it in the feature database 140. The information contained in a recognition result includes the number of the camera that captured the image, a frame ID made unique by the capture time or the like, the detection method and feature extraction method selected by the selection control unit 131, the detection region indicating the position of the object in the image, and the D-dimensional feature obtained from the object.

The recognition result information is accumulated as a recognition result table 610, which may be sent for every image frame or sent in batches at fixed intervals. Because the detection method and feature extraction method are attached, the feature database can manage and record the data in separate files according to this information, which speeds up data retrieval. It also becomes possible to search among features obtained by the same method, or to search across different methods after applying a predetermined conversion, which improves search accuracy.
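One way to picture a row of the recognition result table and the per-method partitioning is the sketch below. The field names and the `group_by_method` helper are illustrative assumptions; the source lists the kinds of information stored but not a concrete schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class RecognitionResult:
    """One row of the recognition result table (hypothetical field names)."""
    camera_id: str
    frame_id: str
    detection_method: str
    feature_method: str
    region: Tuple[int, int, int, int]  # (x, y, width, height) in the image
    feature: List[float]


def group_by_method(
    results: List[RecognitionResult],
) -> Dict[Tuple[str, str], List[RecognitionResult]]:
    """Partition results by (detection method, feature extraction method)."""
    groups: Dict[Tuple[str, str], List[RecognitionResult]] = defaultdict(list)
    for result in results:
        groups[(result.detection_method, result.feature_method)].append(result)
    return dict(groups)
```

A search restricted to one method pair then only has to scan the matching group rather than every stored record.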

FIG. 7 is a diagram showing the detailed configuration of the evaluation unit 135. The image collection unit 701 acquires evaluation images from the transmission unit 120 and stores them in the evaluation data storage unit 703. However, images are stored in the evaluation data storage unit 703 only when specific sensor information or a specific recognition result has been acquired. The sensor information acquisition unit 702 acquires sensor information from the transmission unit and notifies the image collection unit when it acquires specific sensor information. This sensor information is, for example, the output of a motion sensor, entry and exit card information obtained with RFID or the like, or radio signal information from mobile phones. Using card information recorded at entry and exit, it is possible to estimate roughly how many people entered the camera's field of view and from where. The evaluation data storage unit 703 stores the images collected by the image collection unit 701, recording the sensor information and recognition result information along with them.

The evaluation execution control unit 704 sends the selection control unit 131 a command to run recognition processing on the images in the evaluation data storage unit 703. During an evaluation run, the selection control unit 131 issues control commands so that all object detection methods 1 to N and all feature extraction methods 1 to M are executed. The evaluation execution control unit outputs the execution command at times when the processing load on the CPU 212 is light, for example at night or when no one appears in the video.

The recognition results for the evaluation images requested by the evaluation execution control unit 704 are input from the recognition result output unit 134 to the recognition result acquisition unit 705 and passed on to the result determination unit 706. The recognition result acquisition unit 705 also acquires recognition results during normal operation, outside evaluation runs, and notifies the image collection unit when it acquires a specific recognition result. Examples of such a specific recognition result are a case where many objects were detected and a case where almost no objects were detected even though there were many moving objects. These correspond, respectively, to images that probably contained many people and images in which detections were probably missed, both of which are well suited as evaluation images, so the image collection unit 701 decides whether to collect the image and stores it. The result determination unit 706 evaluates the accuracy of the recognition results of the multiple methods and sends the compiled results to the statistical analysis unit 136.

With this configuration, images suitable for evaluation can be collected periodically, and the ground truth for each evaluation image (such as where people are and who they are) can be collected automatically based on the sensor information and recognition results. The multiple recognition methods can then be evaluated on these images without interfering with the recognition processing that is needed for normal operation.

FIG. 8 is a flowchart of the image collection decision in the image collection unit 701. After the decision process starts in S81, S83 first determines whether the sensor value obtained from the sensor information acquisition unit 702 is greater than or equal to a threshold Ss. If the sensor is a motion sensor, for example, this threshold applies to the sensed amount; if the sensor reports passages through a gate or door using RFID or the like, it applies to the number of people passing within a fixed time (for example, three or more). If the value is at or above the threshold, the flow moves to S86, where the current image is collected and cached in temporary memory. If it is below the threshold, the flow moves to S84. S84 compares the recognition value with a threshold Sr and moves to S86 if it is at or above the threshold, or to S85 if it is below. The recognition value is, for example, the number of moving object regions, or the distance between the extracted face feature and the pre-registered feature of the same person. S85 determines whether a fixed time of T2 seconds or more has elapsed since an evaluation image was last recorded. If not, the flow returns to the decision in S83; if so, it moves to S87 and records the temporarily stored image in the evaluation data storage unit.
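The branching in S83 to S85 can be written as a small decision function. This is only a sketch of the stated comparisons: the function name and argument names are assumptions, and it returns the label of the next step rather than performing the caching or recording itself.

```python
def next_step(
    sensor_value: float,
    recognition_value: float,
    seconds_since_last_record: float,
    ss: float,
    sr: float,
    t2: float,
) -> str:
    """Return the flowchart step that follows the checks in S83 to S85."""
    # S83: sensor value at or above threshold Ss -> cache the current image.
    if sensor_value >= ss:
        return "S86"
    # S84: recognition value at or above threshold Sr -> cache the current image.
    if recognition_value >= sr:
        return "S86"
    # S85: at least T2 seconds since the last recorded evaluation image
    # -> record the cached image; otherwise go back to the sensor check.
    if seconds_since_last_record >= t2:
        return "S87"
    return "S83"


# Example: three people passed an RFID gate (threshold Ss = 3).
print(next_step(3, 0, 10.0, ss=3, sr=5, t2=60.0))
```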

After recording, the flow moves to S82, waits there for T1 seconds, and then continues deciding whether to collect images. The T1-second wait prevents near-identical images from being saved. Through this processing, evaluation images are accumulated one at a time at a fixed interval of T2 seconds. In addition, the sensor values and recognition values make it possible to collect images that are better suited to evaluation.

FIG. 9 is a diagram showing an example of a determination result of the result determination unit 706. For example, at a location where people pass through an entry and exit gate when arriving at work, the IDs read at the time of passage are used as sensor information, and the number of people captured by the camera and the identity of each person are attached to the image and recorded in the evaluation data storage unit 703. Based on this information, two evaluation results are obtained: the detection rate, that is, how many of the three people in the image were detected, and the distance in feature space (similarity) between the feature obtained from a specific person and that person's registered feature. In this way, the accuracy of object detection and of the features can be calculated based on sensor information.
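The two evaluation values can be computed as in the sketch below. The function names are illustrative, and Euclidean distance is an assumption: the text only says "distance in feature space".

```python
import math
from typing import Sequence


def detection_rate(num_detected: int, num_expected: int) -> float:
    """Fraction of the people known to be present (from sensor IDs) that were detected."""
    if num_expected <= 0:
        raise ValueError("num_expected must be positive")
    return num_detected / num_expected


def feature_distance(extracted: Sequence[float], registered: Sequence[float]) -> float:
    """Euclidean distance between an extracted feature and the registered one."""
    if len(extracted) != len(registered):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(extracted, registered)))


# Gate IDs say three people are in view; the detector found two of them.
print(detection_rate(2, 3))
print(feature_distance([0.0, 0.0], [3.0, 4.0]))
```

A smaller distance means the extracted feature is closer to the registered one, so the two numbers together describe detection quality and feature quality for one method on one evaluation image.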

FIG. 10 is a diagram showing an example of the analysis results held by the statistical analysis unit 136. The statistical analysis unit 136 accumulates the evaluation results obtained by the evaluation unit 135 in a database and performs statistical analysis on the accumulated data at fixed time intervals. As a result, the statistical analysis unit obtains an analysis result table 1010. Tables 1010a and 1010b show examples of the analysis result tables for cameras 110a and 110b, respectively.

The result table has columns such as time period, weather, detection method, detection area, and feature extraction method. Separating the recognition results by time period allows the selection control unit 131 to switch recognition methods for each time period. Similarly, recording the weather at the time allows the recognition method to be changed according to the weather. In the detection method columns, the correct detection accuracy is given as the recognition rate of each detection method, and in the feature extraction method columns, the authentication accuracy of each feature extraction method is given; both are obtained by averaging over all the evaluation data. This makes it possible to determine statistically which recognition method performs best in a given time period.

The detection area column stores, for example, the region in which all detected objects were located. This shows, for each time period, the region where objects are likely to be present. The total number of data items used in the statistical analysis may also be stored in this table. This allows the reliability of each recognition rate to be judged, so that method selection can be skipped when there is too little statistical data. Statistical analysis tables may additionally be created and output for each lighting condition, each face size, and each CPU load condition.
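A minimal sketch of the averaging and the sample-count check described above follows. The tuple layout `(time period, method, accuracy)`, the function names, and the `min_samples` parameter are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Key = Tuple[str, str]  # (time period, method name)


def analyze(results: Iterable[Tuple[str, str, float]]) -> Dict[Key, Tuple[float, int]]:
    """Average the accuracy per (time period, method) and keep the sample count."""
    sums: Dict[Key, list] = defaultdict(lambda: [0.0, 0])
    for period, method, accuracy in results:
        entry = sums[(period, method)]
        entry[0] += accuracy
        entry[1] += 1
    return {key: (total / count, count) for key, (total, count) in sums.items()}


def best_method(
    table: Dict[Key, Tuple[float, int]], period: str, min_samples: int
) -> Optional[str]:
    """Pick the most accurate method for a period, ignoring thin statistics."""
    candidates = [
        (mean, method)
        for (p, method), (mean, count) in table.items()
        if p == period and count >= min_samples
    ]
    # None means "do not select a method yet": too little data was collected.
    return max(candidates)[1] if candidates else None


evaluations = [
    ("morning", "frontal", 0.5),
    ("morning", "frontal", 1.0),
    ("morning", "downward", 0.875),
]
table = analyze(evaluations)
print(best_method(table, "morning", min_samples=2))
```

With `min_samples=2` the single `"downward"` measurement is ignored even though its accuracy is higher, which mirrors the idea of not switching methods on unreliable statistics.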

FIG. 11 is an example of the method selection table of the selection control unit 131. For each camera and each time period, it specifies the object detection method and feature extraction method that give the most accurate recognition, based on the results of the statistical analysis unit 136. During operation, the processing performed by the object detection unit 132 and the feature extraction unit 133 is specified according to this table. The method control table 1110 is updated at a fixed period, such as every day or every week. It is reset when, for example, the camera installation position is changed.
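A lookup against such a table might look like the following sketch. The `MethodEntry` fields, the use of hour ranges as time periods, and the fallback `default` are all assumptions; the source only states that the table is keyed by camera and time period.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MethodEntry:
    """One row of a per-camera method selection table (hypothetical layout)."""
    start_hour: int        # inclusive
    end_hour: int          # exclusive
    detection_method: str
    feature_method: str


def select_methods(
    table: Dict[str, List[MethodEntry]],
    camera_id: str,
    hour: int,
    default: Tuple[str, str],
) -> Tuple[str, str]:
    """Return (detection method, feature method) for a camera at a given hour."""
    for entry in table.get(camera_id, []):
        if entry.start_hour <= hour < entry.end_hour:
            return entry.detection_method, entry.feature_method
    # No row yet, e.g. right after the table was reset for a moved camera.
    return default


table = {
    "110a": [
        MethodEntry(0, 6, "downward", "head"),
        MethodEntry(6, 24, "frontal", "head+face"),
    ],
}
print(select_methods(table, "110a", 12, default=("frontal", "head")))
```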

With the above configuration, image recognition can be performed for the very large number of cameras installed at the multiple locations that make up the monitoring system, using the recognition method best suited to the environment of each camera.

FIG. 12 shows an example of the monitoring system management screen displayed on the display unit 170. The management information of the monitoring system is managed by the management unit 160, which, for example, collectively manages the method control table 1110 of each camera held by the selection control unit 131 and the analysis result table 1010 held by the statistical analysis unit. The display unit 170 extracts the necessary information from the management unit 160, formats it, and displays it in an easily understood form. These screens can be built using HTML, Flash, or the like. FIG. 12(a) shows an example of the processing method confirmation screen 1210, on which the object detection method and the feature extraction method adopted for each camera can be checked. By changing the setting of an item here and pressing the "Update" button, the method selection table of the selection control unit 131 can also be changed through the management unit 160. In this way, most recognition methods can be decided automatically from the statistical analysis results, with only fine adjustments made manually. Pressing the "Fix" button keeps the decided method in use without further updates. The current image window at the lower right of the screen displays the recognition result for the camera's current image, so the recognition method can be fine-tuned while checking the image.

FIG. 12(b) shows an example of the evaluation result confirmation screen 1220. Here, the evaluation results obtained by the statistical analysis unit 136 can be checked through the management unit 160. When a method to be checked is selected, the evaluation image is shown in the window at the bottom of the screen with its recognition result and correct-answer information superimposed, so the evaluation result can be verified. The buttons at the bottom of the screen control playback, such as stepping forward and backward through the images, and the buttons on the right side of the screen delete evaluation images or add and correct correct-answer information. This improves the quality of the evaluation images and enables a more accurate evaluation of accuracy.

FIG. 13 shows an example of a screen for performing an image search on the display unit 170. On the search result display screen 1310, selecting a specific face image in the query image window and pressing the search button beside the window starts a search for images showing similar faces. The search command is sent to the feature quantity database 140, which searches for data whose feature quantities are similar to those of the query face and outputs the matching data in order of similarity (in ascending order of distance). The display unit 170 retrieves images from the image database 150 based on the frame ID information in the acquired data and displays them on the screen. The search results can also be restricted to images that were processed with the same detection method and feature extraction method as the query image. The results can also be displayed separately, as results between "near methods", which differ in only one or two methods, and results between "different methods", which differ in more. For example, the distance between two images of the same face may be 0.1 when both were processed with the same method, whereas between different methods the distance between images of the same person's face may be no smaller than 0.3. If these were shown in a single result list, faces captured under a different method would fall toward the end of the results and be hard to find. Displaying the results separately for the same method and for different methods makes it easier to find search results showing the same face.
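
The grouped display can be sketched as below. This is only an illustration under assumptions: each search hit is a hypothetical dictionary carrying a `distance`, a `frame_id`, and a tuple `methods` of method identifiers; the parameter `near_limit` (how many differing methods still count as "near") and the function names are introduced here, not taken from the source.

```python
def method_gap(query_methods, hit_methods):
    """Count how many method identifiers differ between query and hit."""
    return sum(q != h for q, h in zip(query_methods, hit_methods))

def group_hits(query_methods, hits, near_limit=2):
    """Split hits into same / near / different method groups.

    Hits are sorted by ascending distance first, so each group keeps
    its own similarity ranking.
    """
    groups = {"same": [], "near": [], "different": []}
    for hit in sorted(hits, key=lambda h: h["distance"]):
        gap = method_gap(query_methods, hit["methods"])
        if gap == 0:
            label = "same"
        elif gap <= near_limit:
            label = "near"
        else:
            label = "different"
        groups[label].append(hit["frame_id"])
    return groups

query = ("detector_A", "feature_X")  # (detection method, extraction method)
hits = [
    {"frame_id": "f1", "distance": 0.1, "methods": ("detector_A", "feature_X")},
    {"frame_id": "f2", "distance": 0.3, "methods": ("detector_B", "feature_Y")},
    {"frame_id": "f3", "distance": 0.2, "methods": ("detector_A", "feature_Y")},
]
# With only two method fields, near_limit=1 keeps the "different" group reachable.
grouped = group_hits(query, hits, near_limit=1)
```

Ranking within each group separately means a true match at distance 0.3 under a different method is shown at the top of its own group rather than buried behind same-method results.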

In summary, the monitoring system described in this embodiment comprises a camera, a detection unit that detects an object from an input image captured by the camera, a feature amount extraction unit that extracts a feature amount of the detected object, a storage unit that stores the input image, the object, and the feature amount, a selection control unit that controls the detection unit and the feature amount extraction unit, and an evaluation unit that evaluates the results from the detection unit and the feature amount extraction unit. Based on the output from the evaluation unit, the selection control unit selects the detection method used by the detection unit and the extraction method used by the feature amount extraction unit.

With these features, images suitable for evaluation can be collected periodically, and the correct-answer information for those evaluation images (such as the position of a person and who that person is) can be collected automatically based on sensor information and recognition results.

<Limitation of detection area>
A detection area limiting method according to a second embodiment of the present invention is described with reference to FIG. 14. FIG. 14(a) is an image of a passage captured by a camera 110a, and FIG. 14(b) is an image showing an example in which the recognition results accumulated in the statistical analysis unit 136 are drawn on the image. As FIG. 14(b) shows, objects such as faces appear only in a limited area of the passage, and with a specific size in each area. Statistical analysis therefore makes it possible to bound the size of the objects present in each divided area of the image, as shown in FIG. 14(c). In FIG. 14(c), an area marked 0 is one in which no object appeared; every other area shows the minimum and maximum sizes of the objects that appeared there. By referring to this table, the selection control unit 131 can restrict the regions in which object detection is performed and the object sizes to be searched for, which reduces processing time and removes false detections.
Furthermore, a map of the location where the camera is installed can be estimated from the appearance distribution obtained by the statistical processing. Matching this against map data of the facility in which the camera is installed makes it possible to estimate which part of the map is being captured, and thus to estimate the current camera position automatically.
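
The per-area size table of FIG. 14(c) can be approximated as in the following sketch. The grid layout, the `(cx, cy, size)` detection tuples, and the function names are assumptions for illustration; empty cells are represented by `None` here rather than the 0 used in the figure.

```python
def cell_index(cx, cy, width, height, rows, cols):
    """Map an image coordinate to its (row, col) cell in the grid."""
    r = min(int(cy * rows / height), rows - 1)
    c = min(int(cx * cols / width), cols - 1)
    return r, c

def build_size_grid(detections, width, height, rows, cols):
    """Record the (min, max) object size seen in each cell; None means no object."""
    grid = [[None] * cols for _ in range(rows)]
    for cx, cy, size in detections:
        r, c = cell_index(cx, cy, width, height, rows, cols)
        cell = grid[r][c]
        if cell is None:
            grid[r][c] = (size, size)
        else:
            grid[r][c] = (min(cell[0], size), max(cell[1], size))
    return grid

def is_plausible(grid, cx, cy, size, width, height):
    """Accept a candidate only if its cell has seen objects of a compatible size."""
    r, c = cell_index(cx, cy, width, height, len(grid), len(grid[0]))
    cell = grid[r][c]
    return cell is not None and cell[0] <= size <= cell[1]

past = [(100, 100, 40), (120, 90, 60), (500, 400, 120)]  # (cx, cy, size) in pixels
grid = build_size_grid(past, width=640, height=480, rows=2, cols=2)
```

A detector could skip cells holding `None` entirely and limit its scale search to the stored range elsewhere, which is where the processing-time saving would come from.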

<Evacuation guidance>
A monitoring system for special situations according to a third embodiment of the present invention is described with reference to FIG. 15. FIG. 15 shows an example of the special-condition priority method table 1510 held by the selection control unit 131. The special-condition priority method table 1510 indicates the priority of the processing of the object detection unit 132 and the feature extraction unit 133 when a special condition occurs. Because it covers emergencies, for which a statistically large amount of data cannot be collected, the special-condition priority method table 1510 is set manually. The table has a special situation field, and a priority is assigned to each recognition method to switch to for each special situation, such as a fire or an earthquake. The occurrence of a special situation such as a fire or an earthquake can be obtained from sensors through the communication infrastructure 120. A fire can be detected by installing a smoke detection sensor as the sensor 180, and an earthquake can be detected by receiving an earthquake early warning notification or sensor information from a seismic intensity meter. When the selection control unit 131 receives such sensor information, recognition is performed according to the special-condition method control table 1520, which is generated from the special-condition priority method table 1510 and the analysis result table 1010. For the special-condition method control table 1520, the method selected is the one for which the product of the recognition rate R of the method in the analysis result table 1010 and the priority P in the special-condition priority method table 1510 is at or above a certain threshold and is the largest among the candidate methods. This allows switching such as, during an earthquake, concentrating on moving object detection without performing face detection, or having one camera perform no recognition at all so that recognition is concentrated on another camera that is important during the disaster. The minimum detection size held in the special-condition priority method table 1510 specifies the minimum size at which objects should appear in the image in these special situations. Comparing this value with the minimum and maximum object sizes held by the statistical analysis unit 136 determines what zoom, pan, and tilt values the camera should take in the special situation. Recording these as camera switching values in the special-condition method control table 1520 allows the optimum camera parameters to be set when a special situation occurs, based on the statistical analysis results of each camera. For example, a camera that is normally zoomed in so that people's faces appear clearly can be switched to a wider angle when an earthquake occurs so that the whole scene is monitored. In addition, if a table is prepared for each PTZ camera setting value as a special situation, the detection method can also be switched for each PTZ orientation of the camera.
With the above configuration, even in special situations such as disasters, for example earthquakes, or crowding, for example traffic congestion, the optimum recognition method can be applied for each camera environment, based on priorities set manually in advance and on past statistics.
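
The selection rule (product of recognition rate R and priority P at or above a threshold, and largest among the candidates) can be sketched as follows. The dictionaries, the method names, and the function `select_special_method` are hypothetical; returning `None` when no method qualifies is an assumption standing in for "perform no recognition on this camera".

```python
def select_special_method(recognition_rates, priorities, threshold):
    """Return the method maximizing R * P among those with R * P >= threshold.

    recognition_rates: method -> R (from the analysis result table)
    priorities:        method -> P (manually set for one special situation)
    Returns None if no method reaches the threshold.
    """
    scores = {m: r * priorities.get(m, 0.0) for m, r in recognition_rates.items()}
    eligible = {m: s for m, s in scores.items() if s >= threshold}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)

rates = {"face_detection": 0.9, "motion_detection": 0.8}
earthquake_priority = {"face_detection": 0.0, "motion_detection": 1.0}

chosen = select_special_method(rates, earthquake_priority, threshold=0.5)
none_chosen = select_special_method(rates, earthquake_priority, threshold=0.9)
```

In this example the face detector has the higher recognition rate, but its priority of 0 for the earthquake case removes it, so the camera concentrates on motion detection.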

110a, 110b ... camera
120 ... communication infrastructure
130 ... image recognition unit
131 ... selection control unit
132 ... object detection unit
133 ... feature extraction unit
134 ... recognition result output unit
135 ... evaluation unit
136 ... statistical analysis unit
140 ... feature quantity database
150 ... image database
160 ... management unit
170 ... display unit
180a ... sensor
210 ... server computer
211 ... I/F
212 ... CPU
213 ... memory
214 ... HDD
610 ... recognition result table
701 ... image collection unit
702 ... sensor information acquisition unit
703 ... evaluation data storage unit
704 ... evaluation execution control unit
705 ... recognition result acquisition unit
706 ... result determination unit
1010 ... analysis result table
1110 ... method control table
1210 ... processing method confirmation screen
1220 ... evaluation result confirmation screen
1310 ... search result confirmation screen
1510 ... special-condition priority method table
1520 ... special-condition method control table

Claims

1. A monitoring system comprising:
a camera;
a detection unit that detects an object from an input image captured by the camera;
a feature amount extraction unit that extracts a feature amount of the detected object;
a storage unit that stores the input image, the object, and the feature amount;
a selection control unit that controls the detection unit and the feature amount extraction unit; and
an evaluation unit that evaluates results from the detection unit and the feature amount extraction unit,
wherein the selection control unit selects a detection method for the detection unit and an extraction method for the feature amount extraction unit based on an output from the evaluation unit.

2. The monitoring system according to claim 1, further comprising a statistical analysis unit that accumulates the results output from the evaluation unit, performs statistical analysis, and outputs the analysis results to the selection control unit.

3. The monitoring system according to claim 2, wherein a face is detected as the object.

4. The monitoring system according to claim 2, wherein
the storage unit accumulates the input image at specific time intervals,
the detection unit and the feature amount extraction unit execute a plurality of detection methods and extraction methods on the input image when the processing load of the detection unit and the feature amount extraction unit is small, and
the evaluation unit evaluates the results of executing each of the detection methods and each of the extraction methods and stores them in the storage unit.

5. The monitoring system according to claim 4, further comprising a sensor information acquisition unit,
wherein the image acquired by the sensor information acquisition unit is used by the evaluation unit.

6. The monitoring system according to claim 2, wherein the statistical analysis unit calculates the performance of the camera for each time period from the results output from the evaluation unit.

7. The monitoring system according to claim 3, wherein
the statistical analysis unit calculates the appearance frequency of the face for each region of the input image by analyzing the positions and sizes of faces detected in the past in the input image, and
the selection control unit controls the size to be detected by the detection unit or the object to be detected according to the appearance frequency.

8. An image search system comprising:
the monitoring system according to claim 4;
an input unit that accepts specification of a query image from a user;
a search unit that searches the storage unit for the object having a feature amount similar to the feature amount extracted from the query image; and
a display unit that displays a result of the search unit.

9. The image search system according to claim 8, wherein
the storage unit includes a feature amount storage unit that stores the feature amount and an image storage unit that stores the input image,
the feature amount storage unit stores the extracted feature amount together with information identifying the detection method and the extraction method used in the extraction, and
the display unit displays search results for each of the detection methods or each of the extraction methods.