JP4945477B2 - Surveillance system, person search method

Info

Publication number: JP4945477B2
Application number: JP2008040117A
Authority: JP
Inventors: 誠一平井
Original assignee: Hitachi Kokusai Electric Inc; Kokusai Denki Electric Inc
Current assignee: Kokusai Denki Electric Inc
Priority date: 2008-02-21
Filing date: 2008-02-21
Publication date: 2012-06-06
Anticipated expiration: 2028-02-21
Also published as: JP2009199322A

Description

本発明は、監視システムと、監視システムにおける人物検索方法に係り、人物検索を高精度に行うことができる監視システムと人物検索方法に関する。 The present invention relates to a monitoring system and a person search method in the monitoring system, and more particularly to a monitoring system and a person search method capable of performing a person search with high accuracy.

従来から、ホテル、ビル、コンビニエンスストア、金融機関、ダム、又は道路のような不特定多数の人が訪れる施設には、犯罪抑止や事故防止等の目的で、映像監視システムが設置されている。
このような映像監視システムでは、監視対象の人物等をカメラ等の撮像装置で撮影し、その映像を、管理事務所や警備室等の監視センタに伝送する。
監視センタに常駐する監視者は、その映像を監視し、目的や必要に応じて、注意をしたり、あるいは映像を録画・保存する。 Conventionally, video surveillance systems have been installed in facilities visited by an unspecified number of people such as hotels, buildings, convenience stores, financial institutions, dams, and roads for the purpose of crime prevention and accident prevention.
In such a video monitoring system, a person or the like to be monitored is photographed by an imaging device such as a camera, and the video is transmitted to a monitoring center such as a management office or a security room.
An observer stationed at the monitoring center watches the video and, depending on the purpose and need, issues warnings or records and stores the video.

この映像監視システムで映像の録画・保存をするための記録媒体として、以前は、ビデオテープ等のシーケンシャル・アクセスの記憶媒体が用いられていた。
しかし、近年では、ハードディスクドライブ（ＨＤＤ）に代表されるランダムアクセス可能な記憶媒体が、映像監視システムに用いられる事例が増えてきている。
ランダムアクセス可能な記憶媒体は、映像の整理がしやすく、繰り返し使用に強いため、普及が進んでいる。 As a recording medium for recording and storing video in this video surveillance system, a sequential access storage medium such as a video tape has been used in the past.
In recent years, however, randomly accessible storage media, typified by the hard disk drive (HDD), are increasingly used in video surveillance systems.
Randomly accessible storage media are becoming widespread because they make video easy to organize and withstand repeated use.

さらに、このランダムアクセス可能な記録媒体は、年々、大容量化が進んでいる。
この大容量化により、録画できる映像のデータ量が、飛躍的に増大している。これにより、長時間録画が可能になった。
しかしながら、大量の録画映像を目視でチェックする際の監視者の負担の増加が、問題として顕在化しつつある。 Furthermore, the capacity of this randomly accessible recording medium is increasing year by year.
With this increase in capacity, the amount of video data that can be recorded has increased dramatically. This made it possible to record for a long time.
However, an increase in the burden on the observer when visually checking a large amount of recorded video is becoming a problem.

そこで、所望の映像をより簡単に見つけ出すための検索機能を備えた映像監視システムが普及しつつある。
この検索機能を備えた映像監視システムにおいては、時刻や外部センサ値の情報を映像とともに記憶する機能を備えていることがある。
外部センサとしては、人感センサ等のアラームがよく用いられている。また、このアラームが出力する値は０又は１の２値であることが多い。実際には、アラームの出力が１になった場合には、その旨と、その時刻とをテキスト情報として記憶することが多い。
このような従来からの検索機能では、人感センサ等のアラームの出力値が１となった際のテキスト情報を検索キーとして、映像を検索するようにしている。
さらに近年では、画像情報を検索キーとする検索機能を備えた監視システムが存在する。
その画像情報を検索キーとする監視システムの一種として、人物検索機能を備えた録画装置が存在する。 Therefore, video surveillance systems having a search function for finding a desired video more easily are becoming widespread.
A video monitoring system having this search function may have a function of storing time and external sensor value information together with video.
A human presence sensor or similar alarm is often used as the external sensor. The value output by such an alarm is usually binary, 0 or 1. In practice, when the alarm output becomes 1, that fact and the time are often stored as text information.
In such a conventional search function, video is searched using as the search key the text information stored when the output of an alarm such as a human presence sensor became 1.
In recent years, there are monitoring systems having a search function using image information as a search key.
As a kind of monitoring system using the image information as a search key, there is a recording apparatus having a person search function.
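The conventional alarm-based search described above can be sketched as follows. This is only an illustration: the `AlarmEvent` record, the function names, and the sensor names are assumptions made for this example and do not appear in the patent.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AlarmEvent:
    sensor: str      # hypothetical sensor name
    time: datetime   # time at which the binary output became 1


def record_alarm(log, sensor, value, time):
    """Store a record only when the binary alarm output is 1."""
    if value == 1:
        log.append(AlarmEvent(sensor, time))


def search_alarm_times(log, sensor):
    """Return the times usable as cue points for video from this sensor."""
    return [event.time for event in log if event.sensor == sensor]


log = []
record_alarm(log, "entrance", 0, datetime(2008, 2, 21, 9, 0, 0))  # ignored
record_alarm(log, "entrance", 1, datetime(2008, 2, 21, 9, 5, 0))
record_alarm(log, "lobby", 1, datetime(2008, 2, 21, 9, 7, 0))
hits = search_alarm_times(log, "entrance")
```

The returned times would then be used to locate the corresponding recorded video.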

人物検索機能とは、ある画像中に映っている人物をユーザが指定すると、その人物と同一人物が映っている他の映像を録画装置内から探し出し、ユーザに一覧提示する機能である。
この際、同一人物判定の手掛かりとして、人物の「顔」の画像が使われることが多い。
このような顔の画像から人物を検索する従来のシステムとしては、例えば、特許文献１を参照すると、多数の人物の顔を検索する、人物検索システム、人物追跡システム、人物検索方法、及び人物追跡方法が開示されている（以下、従来技術１とする。）。
従来技術１のような人物検索システムの人物検索機能は、ある特定の人物について、他の時刻あるいは他の場所での挙動の調査や、移動の軌跡の調査などに利用できる。このため、大変有効である。 The person search function is a function that, when a user designates a person shown in a certain image, searches for other videos showing the same person as that person in the recording apparatus and presents the list to the user.
In this case, a person's “face” image is often used as a clue for determining the same person.
As a conventional system for searching for a person from such face images, Patent Document 1, for example, discloses a person search system, a person tracking system, a person search method, and a person tracking method that search the faces of a large number of people (hereinafter referred to as Prior Art 1).
The person search function of a person search system such as that of Prior Art 1 can be used to investigate the behavior of a specific person at other times or in other places, or to trace that person's movements. It is therefore very useful.

この従来技術１のような、顔の画像を用いた従来の人物検索機能を備えたシステムの一例について、図９と図１０を参照して説明する。
以降、説明を容易にするため、ユーザが指定した画像を検索入力画像、検索対象となる録画画像を検索対象画像という。
また、検索の結果、同一人物画像として提示される画像を検索出力画像という。
さらに、検索出力画像をサムネイル表示したものの一覧を、検索出力画像一覧という。 An example of a system with a conventional person search function using face images, such as that of Prior Art 1, will be described with reference to FIGS. 9 and 10.
Hereinafter, for ease of explanation, an image designated by the user is referred to as a search input image, and a recorded image to be searched is referred to as a search target image.
An image presented as the same person image as a result of the search is referred to as a search output image.
Further, a list of thumbnails of search output images is referred to as a search output image list.

まず、図９を参照して、人物検索機能を備えた録画装置２０１を含んだ、従来の映像監視システムＸの制御構成を説明する。
ネットワーク２００は、各装置を結ぶ専用線やイントラネット、インターネット等のＩＰネットワーク等である。
録画装置２０１は、画像データをＨＤＤ等に記憶して録画する録画装置である。また、録画装置２０１は、人物検索機能を備えている。
撮像装置２０２は、ＣＣＤやＣＭＯＳ素子等で撮像した画像（動画の映像又は静止画像）にデジタル変換処理を施し、変換された画像データを、ネットワークを介して出力するネットワーク・カメラや監視カメラ等の装置である。
監視端末２０３は、録画装置２０１に録画された画像データをネットワークを介して取得し、液晶ディスプレイやＣＲＴのマスターモニタ等である表示部に画面表示する装置である。また、監視端末２０３は、内蔵されたＣＰＵやプログラムにより、検索入力画像をネットワーク２００経由で録画装置２０１に送信する。また、監視端末２０３は、録画装置２０１から送信された検索出力画像一覧を、画面表示することもできる。さらに、監視端末２０３は、このためのＯＳ（オペレーティング・システム）と、ＯＳ上で動作するプログラムであるユーザインタフェイスを備えている。 First, a control configuration of a conventional video surveillance system X including a recording device 201 having a person search function will be described with reference to FIG.
The network 200 is an IP network such as a dedicated line connecting each device, an intranet, or the Internet.
The recording device 201 is a recording device that stores image data in an HDD or the like for recording. In addition, the recording apparatus 201 has a person search function.
The imaging device 202 is a device such as a network camera or surveillance camera that digitally converts an image (moving video or still image) captured by a CCD, CMOS sensor, or the like, and outputs the converted image data via the network.
The monitoring terminal 203 is a device that acquires image data recorded in the recording device 201 via a network and displays the image data on a display unit such as a liquid crystal display or a CRT master monitor. In addition, the monitoring terminal 203 transmits the search input image to the recording device 201 via the network 200 by a built-in CPU or program. The monitoring terminal 203 can also display the search output image list transmitted from the recording apparatus 201 on the screen. Furthermore, the monitoring terminal 203 includes an OS (operating system) for this purpose and a user interface that is a program operating on the OS.

また、録画装置２０１は、ネットワーク部２１１、記憶部２１２、顔領域検出部２１３、顔特徴量抽出部２１４、顔特徴量記録部２１５、及び顔判定部２１６が、例えば共通のバスにより接続するように構成される。
ネットワーク部２１１は、ＬＡＮインターフェイス等であり、ネットワーク２００からの入出力を行う処理部である。ネットワーク部２１１は、撮像装置２０２から入力される画像データの受信、監視端末２０３からの映像配信リクエスト、検索入力画像の受信、監視端末２０３への映像や検索出力画像の送信を行う。
記憶部２１２は、ＲＡＭやＲＯＭやＨＤＤ等のランダムアクセスが可能な記憶媒体と、そのコントローラを備える部位であり、画像データやその他のデータの記録媒体への読み書きを行う。映像記録の際には、画像データに加え、画像データを再び取り出すためのＩＤ（Ｉｄｅｎｔｉｆｉｃａｔｉｏｎ、ＩＤデータ）も併せて記憶媒体に書き込みを行う。 In the recording device 201, the network unit 211, the storage unit 212, the face area detection unit 213, the face feature amount extraction unit 214, the face feature amount recording unit 215, and the face determination unit 216 are connected by, for example, a common bus.
The network unit 211 is a LAN interface or the like, and is a processing unit that performs input / output from the network 200. The network unit 211 receives image data input from the imaging device 202, receives a video distribution request from the monitoring terminal 203, receives a search input image, and transmits a video and search output image to the monitoring terminal 203.
The storage unit 212 includes a storage medium capable of random access, such as a RAM, a ROM, and an HDD, and a controller thereof, and reads and writes image data and other data on the recording medium. At the time of video recording, in addition to the image data, an ID (Identification, ID data) for retrieving the image data again is written to the storage medium.

顔領域検出部２１３は、ＤＳＰ（ＤｉｇｉｔａｌＳｉｇｎａｌＰｒｏｃｅｓｓｏｒ）やこのＤＳＰ用のプログラム等を備える部位である。顔領域検出部２１３は、入力された画像データに対し、従来の一般的な顔検出のための画像認識処理を用いた顔検出を行う。顔領域検出部２１３は、映像中の顔の存在の有無判定をし、顔が存在する場合にはその領域の座標算出を行う。
顔特徴量抽出部２１４は、ＤＳＰ等を含んで構成される部位であり、顔領域検出部２１３で検出した顔領域に対して画像認識処理を用いて、顔特徴量算出を行う。ここで顔特徴量とは、例えば、顔の輪郭の形状や方向、皮膚の色、目、鼻、口といった主要構成要素の大きさ、形状、配置関係等を数値化したものを示す。
顔特徴量記録部２１５は、記憶部２１２への記憶を行うためのコントローラ等である。顔特徴量記録部２１５は、顔特徴量抽出部２１４で算出した顔特徴量を、記憶部２１２の記録媒体への読み書きを行う。 The face area detection unit 213 is a part including a DSP (Digital Signal Processor), a program for the DSP, and the like. The face area detection unit 213 performs face detection on the input image data using a conventional image recognition process for general face detection. The face area detection unit 213 determines whether or not there is a face in the video, and when there is a face, calculates a coordinate of the area.
The face feature amount extraction unit 214 is a part including a DSP and the like, and performs face feature amount calculation using image recognition processing on the face region detected by the face region detection unit 213. Here, the facial feature amount indicates, for example, a numerical value of the size, shape, arrangement relationship, and the like of main components such as the shape and direction of the face outline, skin color, eyes, nose, and mouth.
The face feature amount recording unit 215 is a controller or the like for performing storage in the storage unit 212. The face feature amount recording unit 215 reads / writes the face feature amount calculated by the face feature amount extraction unit 214 to / from the recording medium of the storage unit 212.

顔判定部２１６は、ＤＳＰ等を含んで構成される部位であり、検索の際に検索入力画像中の人物と録画映像中の人物との同一人物判定を行う部位である。
その上で、顔判定部２１６は、同一人物と判定された録画画像を集めて、検索出力画像一覧を生成する。この判定は、それぞれの画像にて求めた顔特徴量から「顔類似度」と呼ばれる値を算出し、算出した顔類似度と、記憶部２１２に記憶している所定の閾値との大小関係を基に判定する。 The face determination unit 216 is a part that includes a DSP or the like, and is a part that performs the same person determination between the person in the search input image and the person in the recorded video during the search.
The face determination unit 216 then collects the recorded images judged to show the same person and generates the search output image list. For this judgment, a value called “face similarity” is calculated from the face feature amounts obtained from the respective images, and the judgment is based on the magnitude relationship between the calculated face similarity and a predetermined threshold stored in the storage unit 212.
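The threshold-based judgment performed by the face determination unit 216 might be sketched as follows. The text does not specify how the face similarity is computed, so the measure used here (1 / (1 + Euclidean distance) between feature vectors), the assumption that a larger value means a closer match, and all names are hypothetical choices for illustration.

```python
import math


def face_similarity(a, b):
    """Assumed measure: 1 / (1 + Euclidean distance); larger means more alike."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + dist)


def search_same_person(query, recorded, threshold):
    """Return the IDs of recorded images judged to show the same person.

    `recorded` maps an image ID to its stored face feature vector.
    """
    return [
        image_id
        for image_id, features in recorded.items()
        if face_similarity(query, features) >= threshold
    ]


# Hypothetical stored face feature amounts keyed by image ID.
recorded = {101: [0.0, 0.0], 102: [3.0, 4.0], 103: [0.0, 1.0]}
result = search_same_person([0.0, 0.0], recorded, threshold=0.5)
```

The resulting ID list corresponds to the search output image list that is returned to the monitoring terminal.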

図１０を参照して、従来の映像監視システムＸにおける、録画装置２０１の人物検索機能を実際にユーザが使用する際の操作について説明する。図１０は、監視端末２０３の表示部に表示されるユーザインタフェイス画面の一例である。
このユーザインタフェイスでは、監視端末２０３の制御部が、監視端末２０３のキーボードや専用スイッチやジョグダイヤルやマウス等のポインティングデバイスを含む入力部の入力信号を検知して、ポインタ等を動かして表示する。さらに、監視端末２０３の制御部は、表示部の画面上に描かれた入力キーであるボタン等をユーザが押下したことを検知して、各処理を行う。 With reference to FIG. 10, an operation when the user actually uses the person search function of the recording apparatus 201 in the conventional video monitoring system X will be described. FIG. 10 is an example of a user interface screen displayed on the display unit of the monitoring terminal 203.
In this user interface, the control unit of the monitoring terminal 203 detects input signals from the input unit of the monitoring terminal 203, which includes a keyboard, dedicated switches, a jog dial, and a pointing device such as a mouse, and moves and displays a pointer or the like accordingly. Furthermore, the control unit of the monitoring terminal 203 detects that the user has pressed a button or other input key drawn on the screen of the display unit, and performs the corresponding processing.

以下で、より具体的に、ユーザインタフェイス画面の説明をする。
再生映像表示部３０１は、再生中の映像を表示する領域である。再生映像表示部３０１には、監視端末２０３が、録画装置２０１からネットワーク２００を介して送信されてきた画像データを復号して表示する。 The user interface screen will be described more specifically below.
The playback video display unit 301 is an area for displaying the video being played back. In the playback video display unit 301, the monitoring terminal 203 decodes and displays the image data transmitted from the recording device 201 via the network 200.

再生操作部３０２は、再生操作をするボタンが表示された領域である。再生操作部３０２上の各ボタンには、それぞれ固有の再生種類が割当てられている。
再生映像表示部３０１に表示されている映像に対して、新たな再生命令を与えたい場合には、ユーザは再生操作部３０２上の再生命令に対応したボタンをポインティングデバイス等で選択してボタンを押下する。監視端末２０３の制御部は、ユーザインタフェイスにおいてボタンの押下を検出して、各種の再生種類に対応する動作を行う。
たとえば、「左向きの三角が２つ」のボタンは映像の高速巻き戻しを示し、「左向きの三角が１つ」は映像の通常スピードでの巻き戻し再生を示し、「四角」ボタンは映像の停止／一時停止を示し、「右向きの三角が１つ」は映像の通常スピードでの再生を示し、「右向きの三角が２つ」のボタンは映像の高速早送りを示す。
カメラ切替操作部３０３は、カメラ切り換え操作用のボタンを表示する領域である。カメラ切替操作部３０３上の各ボタンには、それぞれ録画対象となっているカメラが割当てられており、再生映像表示部３０１に表示されている映像を、他のカメラの映像に切り替えたい場合に押下する。 The reproduction operation unit 302 is an area where buttons for performing a reproduction operation are displayed. Each button on the playback operation unit 302 is assigned a unique playback type.
To give a new playback command for the video displayed on the playback video display unit 301, the user selects the button corresponding to that command on the playback operation unit 302 with a pointing device or the like and presses it. The control unit of the monitoring terminal 203 detects the button press in the user interface and performs the operation corresponding to the playback type.
For example, the button with two left-pointing triangles indicates fast rewind, the button with one left-pointing triangle indicates reverse playback at normal speed, the square button indicates stop/pause, the button with one right-pointing triangle indicates playback at normal speed, and the button with two right-pointing triangles indicates fast forward.
The camera switching operation unit 303 is an area that displays buttons for switching cameras. Each button on the camera switching operation unit 303 is assigned a camera that is being recorded, and is pressed to switch the video displayed on the playback video display unit 301 to the video from another camera.

検索入力画像指定部３０４は、検索入力画像を表示・指定する領域である。この指定は、検索入力画像指定部３０４に表示されている「取込」ボタン押下を検知して行う。
監視端末２０３の制御部は、ユーザが「取込」ボタンを押下したことを検知すると、再生映像表示部３０１に表示されている画像を、検索入力画像としてサムネイル表示する。 The search input image specifying unit 304 is an area for displaying and specifying the search input image. This designation is performed by detecting that the “capture” button displayed in the search input image designation unit 304 is pressed.
When the control unit of the monitoring terminal 203 detects that the user has pressed the “capture” button, the image displayed on the playback video display unit 301 is displayed as a thumbnail as a search input image.

検索操作部３０５は、検索操作部上の「検索」ボタンを押下することで、検索入力画像指定部３０４に表示された画像にて検索を行うための領域である。この例では図示しないが、この検索操作部３０５に、日時やカメラの指定をするボタンを備える場合もある。
監視端末２０３の制御部は、ユーザが「検索」ボタンの押下したことを検知すると、検索入力画像指定部３０４に表示されている画像（検索入力画像）がある場合は、録画装置２０１にこの検索用画像に係るフレーム番号をネットワーク２００を介して送信する。
録画装置２０１の制御部は、検索入力画像に係るフレーム番号を受信して、このフレーム番号に係る検索画像を記憶部２１２から読み出し、上述のように各部を用いて人物検索機能を実行する。そして、検索された人物を含む画像である検索出力画像（とそのフレーム番号）を、ネットワーク２００を介して監視端末２０３に送信する。 The search operation unit 305 is an area for performing a search on an image displayed on the search input image specifying unit 304 by pressing a “search” button on the search operation unit. Although not shown in this example, the search operation unit 305 may be provided with buttons for specifying the date and time and the camera.
When the control unit of the monitoring terminal 203 detects that the user has pressed the “search” button, and an image (search input image) is displayed in the search input image specifying unit 304, it transmits the frame number of this search image to the recording device 201 via the network 200.
The control unit of the recording apparatus 201 receives the frame number related to the search input image, reads the search image related to the frame number from the storage unit 212, and executes the person search function using each unit as described above. Then, a search output image (and its frame number) that is an image including the searched person is transmitted to the monitoring terminal 203 via the network 200.

検索出力画像一覧表示部３０６は、検索出力画像の一覧をサムネイル表示する領域である。
ここに表示された検索出力画像のいずれかを押下すると、再生映像表示部３０１の表示映像が押下画像に切り替わる。この切り替わりを頭出しという。 The search output image list display unit 306 is an area for displaying a list of search output images as thumbnails.
When one of the search output images displayed here is pressed, the display video of the playback video display unit 301 is switched to the pressed image. This switching is called cueing.

以上のように、「顔」を手掛かりとした人物検索機能は、膨大な検索対象画像の中から、目的の人物映像への頭出しが容易にできるため大変便利である。 As described above, the person search function using “face” as a clue is very convenient because it is possible to easily find a target person image from a large number of search target images.

Patent Document 1: 特開２００３−２８１１５７号公報 (JP 2003-281157 A)

しかしながら、従来技術１のような画像認識技術は、正面方向から映した顔画像に対する認識精度は優れているものの、それ以外の方向から映した顔画像に対する認識精度は低かった。また、画像中における顔の大きさ（解像度）も認識精度に大きく影響し、顔の大きさが小さい場合には、認識精度も低かった。
従って、広角レンズを備え、対象者を斜め上方から小さく映すことの多い映像監視システムにおいては、誤検索、すなわち異なる人物を同一人物と間違って認識したり、同一人物を異なる人物と間違って認識したりする場合がある、すなわち精度が低いという問題があった。 However, although the image recognition technology such as the prior art 1 has excellent recognition accuracy for the face image projected from the front direction, the recognition accuracy for the face image projected from the other direction is low. In addition, the size (resolution) of the face in the image greatly affects the recognition accuracy, and when the face size is small, the recognition accuracy is low.
Therefore, video surveillance systems, which are often equipped with wide-angle lenses and capture subjects small and from diagonally above, are prone to erroneous searches: different people may be mistakenly recognized as the same person, or the same person as different people. In other words, there is a problem of low accuracy.

本発明は、このような状況に鑑みてなされたものであり、上述の課題を解消することを課題とする。 The present invention has been made in view of these circumstances, and its object is to solve the problems described above.

本発明の人物検索方法は、画像を録画する監視システムにおける人物検索方法において、録画した複数の画像の画像データからそれぞれ求められる、撮影時刻情報、撮影位置情報、顔特徴量、及び着衣情報をランダムアクセス可能な記憶媒体にそれぞれ記憶し、前記撮影時刻情報が近傍か非近傍かの近傍性と、前記着衣情報が同一か非同一かの同一性との４通りの条件の組み合わせに対応する４個の数値からなる時刻＋着衣重み設定値群、前記撮影時刻情報が近傍か非近傍かの近傍性と、前記撮影位置情報が近傍か非近傍かの近傍性との４通りの条件の組み合わせに対応する４個の数値からなる時刻＋位置重み設定値群、前記撮影位置情報が近傍か非近傍かの近傍性と、前記着衣情報が同一か非同一かの同一性との４通りの条件の組み合わせに対応する４個の数値からなる位置＋着衣重み設定値群の少なくとも１つの入力を予め受け、検索入力画像の指定とともに検索の指示を受けたときに、前記録画した複数の画像のそれぞれについて、入力された前記重み設定値群の４個の数値の１つを録画した画像と前記検索入力画像との間の関係に応じて選択し、該録画した画像及び該検索入力画像の前記顔特徴量の差分を該選択した数値で重み付けして得られる総合類似度を用いて、録画された人物の同一性を判断することを特徴とする。
本発明の人物検索方法は、前記時刻＋着衣重み設定値群、前記時刻＋位置重み設定値群、及び前記位置＋着衣重み設定値群は、前記４個の数値の入力欄をマトリクス状に配置した時刻＋着衣重み設定表、時刻＋位置重み設定表、及び位置＋着衣設定表を有する重み設定画面を通じてユーザにより入力又は変更され、前記検索の指示は、前記時刻＋着衣重み設定値群、前記時刻＋位置重み設定値群、及び位置＋着衣重み設定値群の内、任意の複数の指定を含み、前記撮影時刻情報の近傍性、前記着衣情報の同一性及び前記撮影位置情報の近傍性は、それぞれ対応する所定の閾値との比較により判断され、前記総合類似度は、前記指定された複数の重み設定値群からそれぞれ選択された数値を掛け合わせて、前記顔特徴量の差分を重み付けしたものであることを特徴とする。
本発明の人物検索方法は、前記時刻＋着衣重み設定値群、前記時刻＋位置重み設定値群及び位置＋着衣重み設定値群は、以下の７つの条件、（１）近傍の時刻において同一着衣の人物は同一人物である可能性が高い（２）近傍でない時刻においても同一着衣の人物は同一人物である可能性がある（３）近傍の時刻において非同一着衣の人物は同一人物でない可能性が高い（４）近傍の時刻、近傍位置の人物は同一人物である可能性が高い（５）近傍の時刻、非近傍位置の人物は同一人物でない可能性が高い（６）近傍でない時刻、近傍位置の人物は同一人物である可能性がある（７）近傍位置、同一着衣の人物は同一人物である可能性があるに基づいて設定することを特徴とする請求項１又は２に記載の人物検索方法であることを特徴とする。
本発明の監視システムは、撮像装置、録画装置、及び監視端末を備えて画像を録画する監視システムであって、前記録画装置は、前記撮像装置で撮像した画像から顔特徴量を抽出する顔特徴量抽出手段と、前記撮像装置で撮像した画像から着衣特徴量を抽出する着衣特徴量抽出手段と、録画した複数の画像の画像データ、撮影時刻情報、撮影位置情報、前記顔特徴量抽出手段で抽出された顔特徴量、前記着衣特徴量抽出手段で抽出された着衣情報のうち、少なくとも１つ以上をランダムアクセス可能な記憶媒体に記憶する記憶手段と、前記録画した複数の画像のそれぞれについて、前記撮影時刻情報が近傍か非近傍かの近傍性と、前記着衣情報が同一か非同一かの同一性との４通りの条件の組み合わせに対応する４個の数値からなる時刻＋着衣重み設定値群、前記撮影時刻情報が近傍か非近傍かの近傍性と、前記撮影位置情報が近傍か非近傍かの近傍性との４通りの条件の組み合わせに対応する４個の数値からなる時刻＋位置重み設定値群、前記撮影位置情報が近傍か非近傍かの近傍性と、前記着衣情報が同一か非同一かの同一性との４通りの条件の組み合わせに対応する４個の数値からなる位置＋着衣重み設定値群の少なくとも１つを録画した画像と検索入力画像との間の関係に応じて選択し、該録画した画像及び該検索入力画像の前記顔特徴量の差分を該選択した数値で重み付けして得られる総合類似度を用いて、録画された人物の同一性を判断する総合判断手段とを備えることを特徴とする。
The person search method of the present invention is a person search method in a monitoring system that records images, in which shooting time information, shooting position information, a face feature amount, and clothing information, each obtained from the image data of a plurality of recorded images, are stored in a randomly accessible storage medium. At least one of the following is input in advance: a time + clothing weight setting value group consisting of four numerical values corresponding to the four combinations of conditions formed by the proximity of the shooting time information (near or not near) and the identity of the clothing information (identical or not identical); a time + position weight setting value group consisting of four numerical values corresponding to the four combinations of conditions formed by the proximity of the shooting time information (near or not near) and the proximity of the shooting position information (near or not near); and a position + clothing weight setting value group consisting of four numerical values corresponding to the four combinations of conditions formed by the proximity of the shooting position information (near or not near) and the identity of the clothing information (identical or not identical). When a search instruction is received together with the designation of a search input image, for each of the plurality of recorded images, one of the four numerical values of the input weight setting value group is selected according to the relationship between the recorded image and the search input image, and the identity of the recorded person is judged using a total similarity obtained by weighting the difference between the face feature amounts of the recorded image and the search input image by the selected numerical value.
In the person search method of the present invention, the time + clothing weight setting value group, the time + position weight setting value group, and the position + clothing weight setting value group are input or changed by the user through a weight setting screen having a time + clothing weight setting table, a time + position weight setting table, and a position + clothing setting table, in each of which the input fields for the four numerical values are arranged in a matrix. The search instruction includes the designation of any plurality of the time + clothing weight setting value group, the time + position weight setting value group, and the position + clothing weight setting value group. The proximity of the shooting time information, the identity of the clothing information, and the proximity of the shooting position information are each judged by comparison with a corresponding predetermined threshold. The total similarity is obtained by weighting the difference between the face feature amounts by the product of the numerical values selected from each of the designated weight setting value groups.
In the person search method of the present invention according to claim 1 or 2, the time + clothing weight setting value group, the time + position weight setting value group, and the position + clothing weight setting value group are set based on the following seven conditions: (1) persons in the same clothing at nearby times are likely to be the same person; (2) persons in the same clothing may be the same person even at times that are not nearby; (3) persons in non-identical clothing at nearby times are likely not to be the same person; (4) persons at nearby times and nearby positions are likely to be the same person; (5) persons at nearby times and non-nearby positions are likely not to be the same person; (6) persons at non-nearby times and nearby positions may be the same person; (7) persons at nearby positions in the same clothing may be the same person.
The monitoring system of the present invention is a monitoring system that comprises an imaging device, a recording device, and a monitoring terminal and records images. The recording device comprises: face feature amount extraction means that extracts a face feature amount from an image captured by the imaging device; clothing feature amount extraction means that extracts a clothing feature amount from an image captured by the imaging device; storage means that stores, in a randomly accessible storage medium, at least one of the image data of a plurality of recorded images, shooting time information, shooting position information, the face feature amount extracted by the face feature amount extraction means, and the clothing information extracted by the clothing feature amount extraction means; and comprehensive judgment means that, for each of the plurality of recorded images, selects, according to the relationship between the recorded image and a search input image, at least one of the time + clothing weight setting value group, the time + position weight setting value group, and the position + clothing weight setting value group, each consisting of four numerical values corresponding to the four combinations of conditions described above, and judges the identity of the recorded person using a total similarity obtained by weighting the difference between the face feature amounts of the recorded image and the search input image by the selected numerical value.
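The weighting scheme described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the weight values, the thresholds, the representation of position as coordinates and of clothing information as a vector, the use of Euclidean distance as the "difference" of feature amounts, and all function and key names. The passage above also does not state whether a smaller or a larger total similarity indicates the same person, so the sketch only computes the weighted value.

```python
import math


def distance(a, b):
    """Euclidean distance, used here as the assumed 'difference' measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def relations(query, rec, th):
    """Judge proximity/identity by comparison with per-item thresholds."""
    return {
        "time_near": abs(query["time"] - rec["time"]) <= th["time"],
        "pos_near": distance(query["pos"], rec["pos"]) <= th["pos"],
        "cloth_same": distance(query["cloth"], rec["cloth"]) <= th["cloth"],
    }


# Which two conditions index each weight setting value group.
GROUP_KEYS = {
    "time+clothing": ("time_near", "cloth_same"),
    "time+position": ("time_near", "pos_near"),
    "position+clothing": ("pos_near", "cloth_same"),
}


def total_similarity(query, rec, groups, th):
    """Weight the face feature difference by the product of selected values."""
    rel = relations(query, rec, th)
    weight = 1.0
    for name, table in groups.items():
        first, second = GROUP_KEYS[name]
        weight *= table[(rel[first], rel[second])]  # pick 1 of the 4 values
    return weight * distance(query["face"], rec["face"])


# Hypothetical 2x2 tables: four values per group, keyed by the two conditions.
groups = {
    "time+clothing": {(True, True): 0.5, (True, False): 2.0,
                      (False, True): 0.8, (False, False): 1.0},
    "time+position": {(True, True): 0.5, (True, False): 2.0,
                      (False, True): 0.8, (False, False): 1.0},
}
thresholds = {"time": 60.0, "pos": 10.0, "cloth": 0.1}
query = {"time": 0.0, "pos": (0.0, 0.0), "cloth": [1.0, 0.0], "face": [0.0, 0.0]}
rec = {"time": 30.0, "pos": (100.0, 0.0), "cloth": [1.0, 0.0], "face": [3.0, 4.0]}
score = total_similarity(query, rec, groups, thresholds)
```

In this example the recorded image is near in time, far in position, and has the same clothing, so the values 0.5 (time + clothing) and 2.0 (time + position) are selected and multiplied before weighting the face feature difference.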

本発明によれば、精度を向上させた映像監視システムと人物検索方法を提供することができる。 According to the present invention, it is possible to provide a video surveillance system and a person search method with improved accuracy.

＜第１の実施の形態＞
〔制御構成〕
図１を参照して、本発明の実施の形態に係る人物検索方法による人物検索機能を備える録画装置１０１を含む映像監視システムＹの制御構成について説明する。
ネットワーク１００は、各装置を結ぶ、ＬＡＮ、光ファイバー、ｃ．ｌｉｎｋ、無線ＬＡＮ、メッシュネットワーク等のデータ通信可能な回線である。また、ネットワーク１００は、専用線、イントラネット、インターネット等のＩＰネットワーク等を用いてもよい。
録画装置１０１は画像データをＨＤＤ等に記憶して録画する録画装置である。また、録画装置１０１は、従来技術の録画装置２０１と同様に顔検索機能を備えており、さらに着衣検出機能を備えている。この着衣検出機能と、顔検出機能とを組み合わせることで、本発明の実施の形態に係る精度を向上させた人物検索方法を提供することができる。この人物検索方法については後述する。
撮像装置１０２は、撮像装置２０２と同様な機能をもつ画像撮影用の装置である。
監視端末１０３は、監視端末２０３と同様な機能をもつ装置であり、ＰＣ／ＡＴ互換機やＭＡＣ等であるＰＣ（パーソナル・コンピュータ）の記憶部に記憶したプログラムにより実現してもよいし、専用の監視端末装置として実現してもよい。また、監視端末１０３は、録画装置１０１の人物検索方法を用いるためのユーザインタフェイスも備えている。 <First Embodiment>
[Control configuration]
With reference to FIG. 1, a control configuration of a video surveillance system Y including a recording device 101 having a person search function according to a person search method according to an embodiment of the present invention will be described.
The network 100 is a line capable of data communication that connects the devices, such as a LAN, optical fiber, c.link, wireless LAN, or mesh network. The network 100 may also use a dedicated line or an IP network such as an intranet or the Internet.
The recording device 101 is a recording device that stores image data in an HDD or the like for recording. In addition, the recording apparatus 101 has a face search function as in the conventional recording apparatus 201, and further has a clothing detection function. By combining this clothing detection function and the face detection function, a person search method with improved accuracy according to the embodiment of the present invention can be provided. This person search method will be described later.
The imaging device 102 is an image capturing device having the same function as the imaging device 202.
The monitoring terminal 103 is a device having the same functions as the monitoring terminal 203. It may be realized by a program stored in the storage unit of a PC (personal computer), such as a PC/AT compatible machine or a Mac, or as a dedicated monitoring terminal device. The monitoring terminal 103 also has a user interface for using the person search method of the recording device 101.

録画装置１０１は、ネットワーク部１１１、記憶部１１２（記憶手段）、顔領域検出部１１３、顔特徴量抽出部１１４（顔特徴量抽出手段）、顔特徴量記録部１１５、着衣領域検出部１１６、着衣特徴量抽出部１１７（着衣特徴量抽出手段）、着衣特徴量記録部１１８、撮影時刻記録部１１９、撮影位置記録部１２０、総合判定部１２１（総合判断手段）が、例えば共通のバスにより接続するように構成される。 In the recording device 101, the network unit 111, the storage unit 112 (storage means), the face region detection unit 113, the face feature amount extraction unit 114 (face feature amount extraction means), the face feature amount recording unit 115, the clothing region detection unit 116, the clothing feature amount extraction unit 117 (clothing feature amount extraction means), the clothing feature amount recording unit 118, the shooting time recording unit 119, the shooting position recording unit 120, and the overall determination unit 121 (comprehensive judgment means) are connected by, for example, a common bus.

ネットワーク部１１１は、装置外部からの入出力を行うＬＡＮインターフェイス等の処理部である。ネットワーク部１１１は、撮像装置１０２が送信する画像データを受信を行う。また、ネットワーク部１１１は、監視端末１０３からの映像配信リクエストや検索入力画像の受信を行う。さらに、ネットワーク部１１１は、監視端末１０３への画像データや検索出力画像のデータの送信を行う。
記憶部１１２は、ＲＡＭやＲＯＭやフラッシュメモリや光ディスクや磁気テープやＨＤＤ等の記憶媒体と、インテリジェントなコントローラやＣＰＵやＭＰＵ等を備える部位である。記憶部１１２は、画像（映像）データやその他のデータの記録媒体への読み書きを行う。映像記録の際には、画像データに加え、画像データを再び取り出す為のＩＤ（アイデンディフィケーション）、タイムコード、フレーム番号、各種タグデータ、ハッシュデータ等である映像ＩＤについても合わせて、データベース等を用いて書き込みを行う。また、記憶部１１２は、コントローラやＣＰＵやＭＰＵにより、データベースから映像ＩＤを検索して記憶媒体から読み出し・書き込みを行うこともできる。
顔領域検出部１１３は、顔領域検出部２１３と同等の機能を備えるＤＳＰやこのＤＳＰ用のプログラム等を含んで構成される。顔領域検出部１１３は、入力された画像データに対し画像認識技術を用いた顔検出を行う。さらに、顔領域検出部１１３は、映像中の顔の存在の有無判定をし、顔が存在する場合にはその領域の座標算出を行う。
顔特徴量抽出部１１４は、顔特徴量抽出部２１４と同等の機能を備えるＤＳＰを含んで構成される。また、顔特徴量抽出部１１４が抽出する顔特徴量は、例えば、顔の輪郭の形状や方向、皮膚の色、目や鼻、口といった主要構成要素の大きさ、形状、配置関係等々から、統計的に個人毎に差異が現れるベクトル成分や統計量等を用いることができる。また、本発明の実施の形態に係る顔特徴量抽出部１１４においては、使用する顔特徴量の種類や数は任意である。さらに、髪の長さや推定骨格（顎の大きさ等）の特徴等から、高い確率で男女を判定することも可能である。
顔特徴量記録部１１５は、記憶部１１２への記憶を行うためのコントローラ等を含んで構成される。顔特徴量記録部１１５は、顔特徴量抽出部１１４で算出した顔特徴量について、記憶部１１２の記録媒体への読み書きを行う。 The network unit 111 is a processing unit such as a LAN interface that performs input / output from the outside of the apparatus. The network unit 111 receives image data transmitted from the imaging apparatus 102. Further, the network unit 111 receives a video distribution request and a search input image from the monitoring terminal 103. Further, the network unit 111 transmits image data and search output image data to the monitoring terminal 103.
The storage unit 112 is a part that includes a storage medium such as RAM, ROM, flash memory, an optical disc, magnetic tape, or an HDD, together with an intelligent controller, a CPU, an MPU, or the like. The storage unit 112 reads and writes image (video) data and other data to and from the recording medium. When recording video, in addition to the image data, it also writes, using a database or the like, a video ID for retrieving the image data again, such as an ID (identification), time code, frame number, various tag data, or hash data. The storage unit 112 can also use the controller, CPU, or MPU to look up a video ID in the database and read from or write to the storage medium.
The face area detection unit 113 includes a DSP having the same function as the face area detection unit 213, a program for the DSP, and the like. The face area detection unit 113 performs face detection using image recognition technology on the input image data. Furthermore, the face area detection unit 113 determines whether or not there is a face in the video, and if a face exists, calculates the coordinates of the area.
The face feature amount extraction unit 114 includes a DSP having the same function as the face feature amount extraction unit 214. As the facial feature amount extracted by the face feature amount extraction unit 114, it is possible to use, for example, vector components or statistics that statistically differ from individual to individual, derived from the shape and orientation of the face outline, the skin color, and the size, shape, and arrangement of main components such as the eyes, nose, and mouth. In the face feature amount extraction unit 114 according to the embodiment of the present invention, the type and number of facial feature amounts used are arbitrary. Furthermore, it is also possible to determine with high probability whether the person is male or female from features such as hair length and the estimated skeleton (such as the size of the jaw).
The face feature amount recording unit 115 includes a controller or the like for performing storage in the storage unit 112. The face feature amount recording unit 115 reads / writes the face feature amount calculated by the face feature amount extraction unit 114 to / from the recording medium of the storage unit 112.

着衣領域検出部１１６は、ＤＳＰやこのＤＳＰ用のプログラム等を含んで構成され、入力された画像データに対し、画像認識技術を用いた着衣領域の座標算出を行う。この画像認識技術としては、着衣量の算出に用いることができる公知の技術を用いることができる。たとえば、動的プログラミング等を用いて人物の輪郭を抽出し、その輪郭内の色の分布やテクスチャの周波数的な特徴（ＦＦＴやウェーブレット変換等を行ったときの周波数分布等）から着衣（服）であると検出するような技術を用いることができる。さらに、着衣領域の座標から、背の高さを推定することができ、大人か子供かについて判定できる。
着衣特徴量抽出部１１７は、着衣領域検出部１１６で検出した着衣領域に対して画像認識技術を用いて着衣特徴量算出を行う。ここで着衣特徴量とは、例えば、上述の着衣の色の分布や周波数的な特徴等が挙げられる。また、本発明においては使用する着衣特徴量の種類や数は任意である。また、この着衣特徴量抽出部１１７は、右前あるいは左前といった特徴、曲線的な服装等である特徴、ワンピース等の特徴についても判定することが可能である。これにより、この着衣を着ている人物が男女のどちらであるか、高い確率で判定できる。
着衣特徴量記録部１１８は、記憶部１１２への記憶を行うためのコントローラ等を含んで構成される。着衣特徴量記録部１１８は、着衣特徴量抽出部１１７で算出した着衣特徴量について、記憶部１１２の記録媒体への読み書きを行う。 The clothing region detection unit 116 includes a DSP, a program for the DSP, and the like, and calculates the coordinates of the clothing region in the input image data using image recognition technology. A known technique applicable to clothing detection can be used as this image recognition technology. For example, it is possible to use a technique that extracts the contour of a person using dynamic programming or the like and detects clothing (clothes) from the color distribution within the contour and the frequency characteristics of the texture (such as the frequency distribution obtained by an FFT or a wavelet transform). Furthermore, the person's height can be estimated from the coordinates of the clothing region, so that it can be determined whether the person is an adult or a child.
The clothing feature value extraction unit 117 calculates clothing feature values for the clothing region detected by the clothing region detection unit 116, using image recognition technology. Here, the clothing feature values include, for example, the color distribution and frequency characteristics of the clothing described above. In the present invention, the type and number of clothing feature values used are arbitrary. The clothing feature value extraction unit 117 can also determine features such as which side of the front closure overlaps the other (right-front or left-front), whether the outfit has curved lines, and whether it is a one-piece dress. This makes it possible to determine with high probability whether the person wearing the clothing is male or female.
The clothing feature value recording unit 118 includes a controller for storing data in the storage unit 112. The clothing feature value recording unit 118 reads / writes the clothing feature value calculated by the clothing feature value extraction unit 117 from / to the recording medium of the storage unit 112.
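As an illustration of a clothing feature value based on color distribution, the following is a minimal sketch, not the method of the embodiment. It assumes the clothing region has already been reduced to a list of RGB pixels; the function names `color_histogram` and `histogram_intersection`, the bin count, and the choice of histogram intersection as the comparison are all assumptions introduced here.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0-255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalized coarse RGB histogram for the pixels of a clothing region."""
    if not pixels:
        raise ValueError("clothing region contains no pixels")
    step = 256 // bins_per_channel
    last = bins_per_channel - 1
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Clamp so that bin counts that do not divide 256 evenly stay in range.
        ri, gi, bi = min(r // step, last), min(g // step, last), min(b // step, last)
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1]; 1.0 means the two color distributions coincide."""
    return sum(min(x, y) for x, y in zip(a, b))
```

A frequency-domain texture feature (FFT or wavelet based, as mentioned above) could be appended to the same vector; it is omitted here to keep the sketch dependency-free.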

撮影時刻記録部１１９は、画像が撮影された際の時刻である撮影時刻情報について、記憶部１１２の記録媒体への読み書きを行う。
この撮影時刻情報は、例えば、ＧＰＳ（グローバル・ポジショニング・システム）や、ネットワーク部１１１を介してＮＴＰ（ネットワーク・タイム・プロトコル）等を用いることで、正確な時刻を記憶することが好適である。
また、各撮像装置１０２で基準となる時計の時刻に誤差があった場合、この誤差を計測した上で、「マスタークロック」となる録画装置１０１の時刻に合わせて補正して記憶する。 The shooting time recording unit 119 reads / writes the shooting time information, which is the time when the image is shot, to the recording medium in the storage unit 112.
As this photographing time information, it is preferable to store an accurate time by using, for example, GPS (Global Positioning System), NTP (Network Time Protocol) or the like via the network unit 111.
Further, when the reference clock of an imaging apparatus 102 is in error, the error is measured, and the time is corrected to match the clock of the recording apparatus 101, which serves as the “master clock”, before being stored.
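The clock correction described above can be sketched as follows. This is only an illustrative example under the assumption that the error is a constant offset measured once per camera; the function names are hypothetical and do not appear in the embodiment.

```python
from datetime import datetime, timedelta


def measure_offset(camera_clock: datetime, master_clock: datetime) -> timedelta:
    """Error of a camera clock relative to the recorder's master clock (assumed constant)."""
    return camera_clock - master_clock


def correct_shooting_time(camera_timestamp: datetime, offset: timedelta) -> datetime:
    """Convert a camera-local timestamp to master-clock time before storing it."""
    return camera_timestamp - offset
```

For example, if a camera clock is found to run 3 seconds ahead of the recording apparatus, every timestamp from that camera is shifted back by 3 seconds before it is written as shooting time information.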

撮影位置記録部１２０は、記憶部１１２への記憶を行うためのコントローラ等を含んで構成される。撮影位置記録部１２０は、映像の撮影位置情報について、記憶部１１２の記録媒体への読み書きを行う。撮影位置情報は、例えばカメラのＧＰＳ位置情報のような物理的情報であってもよいし、カメラ設置番号のような、別途設けられた位置算出手段への参照情報であってもよい。 The photographing position recording unit 120 is configured to include a controller or the like for performing storage in the storage unit 112. The shooting position recording unit 120 reads / writes video shooting position information from / to a recording medium in the storage unit 112. The shooting position information may be physical information such as GPS position information of a camera, for example, or may be reference information for a separately provided position calculation unit such as a camera installation number.

総合判定部１２１は、検索の際に検索入力画像中の人物と録画映像中の人物との同一人物判定を行い、同一人物と判定された録画画像を集めて、検索出力画像一覧を生成する。
この判定は、それぞれの画像にて求めた顔特徴量から「顔類似度」と呼ばれる値を算出し、算出した顔類似度に撮影時刻情報と撮影位置情報、後述する着衣情報から求めた重み値を加えた値（以降、総合類似度という。）、あらかじめ設定した所定の閾値等の大小関係において決定する。この顔類似度やその他の処理の詳細については後述する。
また、総合判定部１２１は、このような判定のために、ＲＡＭやレジスタやフラッシュメモリやＨＤＤ等の一時的な記憶手段を備えている。 The overall determination unit 121 performs the same person determination between the person in the search input image and the person in the recorded video during the search, collects the recorded images determined as the same person, and generates a search output image list.
In this determination, a value called “face similarity” is calculated from the facial feature amounts obtained for each image, and the decision is made from the magnitude relationship between a value obtained by adding to the calculated face similarity a weight value derived from the shooting time information, the shooting position information, and the clothing information described later (hereinafter referred to as the total similarity) and a predetermined threshold value set in advance. Details of the face similarity and the other processes will be described later.
In addition, the comprehensive determination unit 121 includes temporary storage means such as a RAM, a register, a flash memory, and an HDD for such determination.
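The threshold comparison on the total similarity can be sketched as below. The description leaves the face similarity computation to a later passage, so cosine similarity is used here purely as an assumed placeholder; the function names and the way the weight value is passed in are likewise hypothetical.

```python
import math
from typing import Sequence


def face_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two face feature vectors (an assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_same_person(query_face: Sequence[float], recorded_face: Sequence[float],
                   weight: float, threshold: float) -> bool:
    """Total similarity = face similarity + weight value, compared with a preset threshold."""
    total_similarity = face_similarity(query_face, recorded_face) + weight
    return total_similarity > threshold
```

Recorded images for which `is_same_person` returns true would then be collected into the search output image list.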

なお、総合判定部１２１の一時的な記憶手段は、記憶部１１２の特定の領域や、記憶部１１２の記憶媒体に備えていてもよい。
また、総合判定部１２１は、制御部が実行するプログラムとして実現されていてもよく、この場合は、一時的な記憶手段は、プログラムの変数の値やファイルとして保持する。
さらに、顔領域検出部１１３、顔特徴量抽出部１１４、顔特徴量記録部１１５、着衣領域検出部１１６、着衣特徴量抽出部１１７、着衣特徴量記録部１１８、撮影時刻記録部１１９、撮影位置記録部１２０に関しても、制御部が実行するプログラムとして実現されていてもよい。
この場合は、特徴量抽出等の高速応答性やＳＩＭＤ（シングル・インストラクション・マルチプル・データ）形式の多大な演算性能が要求される演算（算出）処理を、別の各種ＤＳＰやＧＰＵ（グラフィック・プロセッシング・ユニット）を備えたチップやボード等で処理するようにしてもよい。 The temporary storage unit of the comprehensive determination unit 121 may be provided in a specific area of the storage unit 112 or a storage medium of the storage unit 112.
In addition, the comprehensive determination unit 121 may be realized as a program executed by the control unit. In this case, the contents of the temporary storage means are held as program variable values or files.
Furthermore, the face area detection unit 113, the face feature amount extraction unit 114, the face feature amount recording unit 115, the clothing region detection unit 116, the clothing feature value extraction unit 117, the clothing feature value recording unit 118, the shooting time recording unit 119, and the shooting position recording unit 120 may also be realized as programs executed by the control unit.
In this case, computation that requires fast response or heavy SIMD (single instruction, multiple data) arithmetic performance, such as feature amount extraction, may be processed by separate chips or boards equipped with DSPs or GPUs (graphics processing units).

この他に、映像監視システムＹにおいては、上記の各処理の制御を行うＣＰＵ等の制御部を備えている。制御部で実行されるプログラムは、各部をハードウェア資源を使用して実現するために用いられてもよく、記憶部１１２に記憶されていても、制御部内のＲＯＭやフラッシュメモリ等に記憶されていてもよい。 In addition, the video monitoring system Y includes a control unit such as a CPU that controls each of the processes described above. The program executed by the control unit may be used to implement each unit using hardware resources, and may be stored in the storage unit 112 or in a ROM, flash memory, or the like within the control unit.

〔人物検索方法の精度向上のための手法〕
本発明の発明者は、鋭意実験と検討を繰り返した結果、人物検索機能の性能を高めるためには、同一人物の判定に、撮影時刻情報、撮影位置情報、着衣情報等を加えるのが有効であることを見いだし、本発明を完成するに至った。
具体的には、検索入力画像と検索対象画像との間で、撮影時刻情報、撮影位置情報、着衣情報等を比較することで、同一人物の判定を確実に行うことができる。
ここで、撮影時刻情報、撮影位置情報については、その人物が近場に存在するかどうかの近傍性（時間的、空間的な近さ）を判断する。
着衣情報は、着衣特徴量のベクトル等であり、着衣の特徴によりその人物が同一であるかどうかの同一性を判断する。
この判断結果については、撮影時刻情報、撮影位置情報、着衣情報等に、以下の条件（１）〜（７）を用いて算出した重みの値を適用して判断する。
（１）近傍の時刻において同一着衣の人物は同一人物である可能性が高い
（２）近傍でない時刻においても同一着衣の人物は同一人物である可能性がある
（３）近傍の時刻において非同一着衣の人物は同一人物でない可能性が高い
（４）近傍の時刻、近傍位置の人物は同一人物である可能性が高い
（５）近傍の時刻、非近傍位置の人物は同一人物でない可能性が高い
（６）近傍でない時刻、近傍位置の人物は同一人物である可能性がある
（７）近傍位置、同一着衣の人物は同一人物である可能性がある
この算出した重みを、顔特徴量を用いた同一人物の判断結果に掛け合わせることで、人物検索の検索結果の精度を飛躍的に向上させることが可能となった。
なお、（１）〜（７）の一部を判断するものであっても、人物検索精度を向上させることができる。 [Method for improving the accuracy of person search methods]
As a result of repeated experiments and study, the inventor of the present invention found that adding shooting time information, shooting position information, clothing information, and the like to the determination of the same person is effective in improving the performance of the person search function, and thereby completed the present invention.
Specifically, the same person can be reliably determined by comparing shooting time information, shooting position information, clothing information, and the like between the search input image and the search target image.
Here, for the shooting time information and the shooting position information, the proximity (temporal and spatial proximity) of whether or not the person exists in the near field is determined.
The clothing information is a vector of clothing feature values and the like, and the identity of whether or not the person is the same is determined based on the clothing features.
This determination is made by applying, to the shooting time information, shooting position information, clothing information, and the like, weight values calculated using the following conditions (1) to (7).
(1) Persons in the same clothes at nearby times are likely to be the same person
(2) Persons in the same clothes may be the same person even at times that are not nearby
(3) Persons in different clothes at nearby times are likely not to be the same person
(4) Persons at nearby times and nearby positions are likely to be the same person
(5) Persons at nearby times but non-nearby positions are likely not to be the same person
(6) Persons at non-nearby times but nearby positions may be the same person
(7) Persons at nearby positions in the same clothes may be the same person
Multiplying the result of the same-person determination based on the facial feature amount by these calculated weights makes it possible to dramatically improve the accuracy of person search results.
Note that person search accuracy can be improved even when only some of conditions (1) to (7) are evaluated.
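One way to turn conditions (1) to (7) into multiplicative weights is sketched below. The numeric values (`HIGH`, `POSSIBLE`, `UNLIKELY`, `NEUTRAL`), the function names, and the reduction of nearness and clothing identity to booleans are all assumptions made for illustration; the description itself does not fix them at this point.

```python
# Hypothetical weight values; a value above 1.0 raises the score, below 1.0 lowers it.
HIGH, POSSIBLE, UNLIKELY, NEUTRAL = 1.5, 1.2, 0.5, 1.0


def time_clothing_weight(near_time: bool, same_clothes: bool) -> float:
    if same_clothes:
        return HIGH if near_time else POSSIBLE   # conditions (1) and (2)
    return UNLIKELY if near_time else NEUTRAL    # condition (3)


def time_position_weight(near_time: bool, near_position: bool) -> float:
    if near_time:
        return HIGH if near_position else UNLIKELY  # conditions (4) and (5)
    return POSSIBLE if near_position else NEUTRAL   # condition (6)


def position_clothing_weight(near_position: bool, same_clothes: bool) -> float:
    return POSSIBLE if (near_position and same_clothes) else NEUTRAL  # condition (7)


def weighted_face_score(face_score: float, near_time: bool,
                        near_position: bool, same_clothes: bool) -> float:
    """Multiply the face-based result by the three pairwise weights."""
    weight = (time_clothing_weight(near_time, same_clothes)
              * time_position_weight(near_time, near_position)
              * position_clothing_weight(near_position, same_clothes))
    return face_score * weight
```

Cases not covered by any of the seven conditions are given the neutral weight so that they leave the face-based result unchanged.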

〔データベース〕
次に、図２を参照して、本発明の実施の形態に係る特徴量データベース８０１について説明する。
上述のような精度向上のための手法を実現するため、本発明の実施の形態に係る録画装置１０１では、録画時に、映像ＩＤ、顔特徴量、着衣情報、撮影時刻情報、及び撮影位置情報を記憶する。
これらの情報を記憶するために、録画装置１０１は、記憶部１１２の記憶部の記憶媒体に、特徴量データベース８０１を備えている。
特徴量データベース８０１は、例えば、図２のように表形式で構成することができるが、どのようなものであってもよい。 [Database]
Next, the feature quantity database 801 according to the embodiment of the present invention will be described with reference to FIG.
In order to realize the above-described technique for improving accuracy, the recording apparatus 101 according to the embodiment of the present invention stores a video ID, a facial feature amount, clothing information, shooting time information, and shooting position information at the time of recording.
In order to store this information, the recording apparatus 101 includes a feature amount database 801 in the storage medium of the storage unit 112.
The feature amount database 801 can be configured in a table format as shown in FIG. 2, for example, but may be any type.

特徴量データベース８０１の記憶内容のうち、映像ＩＤ列８１１には、映像ＩＤが記憶される。この映像ＩＤとしては、録画時刻を示すタイムコード又はフレーム番号を用いることができる。このフレーム番号は、例えば１つのストリームの先頭からユニークに割り振った番号であり、単調増加する連続値のような値を用いてもよい。
顔特徴量列８１２は、顔特徴量抽出部１１４で算出された顔特徴量が記憶される。格納する顔特徴量が複数ある場合にはここは複数列となる。
着衣特徴量列８１３は、着衣特徴量抽出部１１７で算出された着衣特徴量が記憶される。格納する着衣特徴量が複数ある場合にはここは複数列となる。
撮影時刻列８１４は、撮影時刻情報が記憶される。
撮影位置列８１５は、撮影位置情報が記憶される。ＧＰＳの位置情報等もここに記憶される。 Among the stored contents of the feature database 801, the video ID column 811 stores the video ID. As this video ID, a time code indicating a recording time or a frame number can be used. The frame number is, for example, a number uniquely assigned from the beginning of one stream, and a value such as a continuous value that monotonously increases may be used.
The face feature value column 812 stores the face feature value calculated by the face feature amount extraction unit 114. When a plurality of face feature values are stored, this column is expanded into a plurality of columns.
The clothing feature value column 813 stores the clothing feature value calculated by the clothing feature value extraction unit 117. When a plurality of clothing feature values are stored, this column is expanded into a plurality of columns.
The shooting time column 814 stores shooting time information.
The shooting position column 815 stores shooting position information. GPS position information and the like are also stored here.

特徴量データベース８０１の書き込み又は読み出しについては、記憶部１１２のＣＰＵ等により、例えばＳＱＬ等のデータベースのデーモン（バックグラウンド実行プログラム等）又はサービスが実行されている。このデーモン又はサービスは、各構成部位からのコマンドにより、特徴量データベース８０１の記憶内容について、読み出しや書き込み（格納）や追記等ができる。 For writing or reading of the feature amount database 801, a database daemon (background execution program or the like) or service such as SQL is executed by the CPU of the storage unit 112 or the like. This daemon or service can read, write (store), add information, etc., with respect to the storage contents of the feature amount database 801 by commands from each component.

顔特徴量記録部１１５は、顔特徴量抽出部１１４からの入力時に、この特徴量データベース８０１の最後尾行の次行、すなわち未格納行を書込み行として、映像ＩＤと顔特徴量を格納する。
着衣特徴量記録部１１８、撮影時刻記録部１１９、又は撮影位置記録部１２０は、書込み行となっている行に対して、それぞれ着衣特徴量、撮影時刻、撮影位置を格納する。
また、総合判定部１２１が実行するプログラムを用いても、特徴量データベース８０１の各行の読み出し（参照）や書き込み（記憶）が可能である。 Upon receiving input from the face feature amount extraction unit 114, the face feature amount recording unit 115 stores the video ID and the face feature amount, using the row following the last row of the feature amount database 801, that is, the first empty row, as the write row.
The clothing feature value recording unit 118, the shooting time recording unit 119, and the shooting position recording unit 120 store the clothing feature value, the shooting time, and the shooting position, respectively, in the row currently designated as the write row.
Each row of the feature amount database 801 can also be read (referenced) and written (stored) by the program executed by the comprehensive determination unit 121.
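The write sequence described above (a new row is appended with the video ID and face feature, and the remaining columns of that row are filled in afterwards) might look like the following sketch using an in-memory SQLite table. The table name, column names, and text encoding of the feature values are hypothetical stand-ins for columns 811 to 815, not the actual layout of the feature amount database 801.

```python
import sqlite3

# Hypothetical column names for the columns filled in after the face feature.
FILLABLE_COLUMNS = {"clothing_feature", "shooting_time", "shooting_position"}


def open_feature_db() -> sqlite3.Connection:
    """Create an in-memory table with one column per item stored at recording time."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE feature_db ("
        "video_id INTEGER PRIMARY KEY, "
        "face_feature TEXT, "
        "clothing_feature TEXT, "
        "shooting_time TEXT, "
        "shooting_position TEXT)"
    )
    return conn


def record_face(conn: sqlite3.Connection, video_id: int, face_feature: str) -> None:
    """Append a new row holding the video ID and the face feature."""
    conn.execute(
        "INSERT INTO feature_db (video_id, face_feature) VALUES (?, ?)",
        (video_id, face_feature),
    )


def record_column(conn: sqlite3.Connection, video_id: int, column: str, value: str) -> None:
    """Fill one remaining column of the row whose video ID matches."""
    if column not in FILLABLE_COLUMNS:
        raise ValueError(f"unexpected column: {column}")
    conn.execute(
        f"UPDATE feature_db SET {column} = ? WHERE video_id = ?",
        (value, video_id),
    )
```

The column name is checked against a fixed set before being interpolated into the statement, since SQL parameters cannot stand in for identifiers.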

〔映像録画の際の処理〕
次に、図３を参照して、本発明の実施の形態に係る録画装置１０１で、実際に映像録画を行う際の処理の流れを説明する。
この映像録画の際に、上述のように撮影時刻情報、撮影位置情報、及び着衣情報についても記憶する。
また、この映像の録画を行う際の処理は、記憶部１１２のＲＯＭやフラッシュメモリやＨＤＤ等の記憶媒体に記憶されたプログラム等に従って、各部が協調して実行する。 [Processing during video recording]
Next, with reference to FIG. 3, the flow of processing when video recording is actually performed by the recording apparatus 101 according to the embodiment of the present invention will be described.
At the time of this video recording, the shooting time information, the shooting position information, and the clothing information are also stored as described above.
In addition, the processing for recording the video is performed in cooperation by each unit according to a program stored in a storage medium such as a ROM, flash memory, or HDD in the storage unit 112.

（ステップＳ７０１）
ステップＳ７０１において、ネットワーク部１１１は、映像受信処理を行う。
具体的には、ネットワーク部１１１は、撮像装置１０２からネットワーク１００を介して入力された画像データを受信して、記憶部１１２へ出力する。 (Step S701)
In step S701, the network unit 111 performs video reception processing.
Specifically, the network unit 111 receives image data input from the imaging apparatus 102 via the network 100 and outputs the image data to the storage unit 112.

（ステップＳ７０２）
ステップＳ７０２において、記憶部１１２は、画像データと映像ＩＤ記憶処理を行う。
すなわち、記憶部１１２は、入力された画像データを、ＨＤＤ等のランダムアクセス可能な記録媒体に記録する。
また、記憶部１１２は、映像を再び取り出す為の映像ＩＤも生成し、併せて記録する。そして、画像データを顔領域検出部１１３へ出力する。
以下で、本フローチャートの説明においては、明示的記載がなくとも画像には映像ＩＤを付してあるものとする。 (Step S702)
In step S702, the storage unit 112 performs image data and video ID storage processing.
That is, the storage unit 112 records the input image data on a randomly accessible recording medium such as an HDD.
The storage unit 112 also generates and records a video ID for taking out the video again. Then, the image data is output to the face area detection unit 113.
In the following description of the flowchart, it is assumed that a video ID is attached to an image even if there is no explicit description.

（ステップＳ７０３）
ステップＳ７０３において、顔領域検出部１１３は、入力された画像データに対し、顔検出処理を行う。
この顔検出処理においては、公知の技術を用いて、映像中の明度や肌色等の色彩やその他の特徴量等を用いて、顔（の画像）を検出して検出値（検出可能性、ｐ、プロバビリティー）を出力することが可能である。 (Step S703)
In step S703, the face area detection unit 113 performs face detection processing on the input image data.
In this face detection process, a known technique can be used to detect a face (face image) from the brightness, colors such as skin color, and other feature amounts in the video, and to output a detection value (detection likelihood, p, probability).

（ステップＳ７０４）
ステップＳ７０４において、総合判定部１２１は、顔が検出されたかについて判定する。
具体的には、総合判定部１２１は、所定の閾値を用いて、顔が検出されたか判定する。すなわち、上述の検出値が所定の閾値を超えていた場合はＹｅｓと判定し、それ以外の場合はＮｏと判定する。
Ｙｅｓの場合は、総合判定部１２１は、処理をステップＳ７０５に進める。
Ｎｏの場合は、総合判定部１２１は、処理をステップＳ７１７に進める。 (Step S704)
In step S704, the comprehensive determination unit 121 determines whether a face has been detected.
Specifically, the overall determination unit 121 determines whether a face has been detected using a predetermined threshold. That is, when the above-described detection value exceeds a predetermined threshold, it is determined as Yes, and otherwise it is determined as No.
In the case of Yes, the comprehensive determination unit 121 advances the process to step S705.
In the case of No, the comprehensive determination unit 121 advances the process to step S717.

（ステップＳ７０５）
ステップＳ７０５において、顔領域検出部１１３は、顔領域座標算出処理を行う。
具体的には、顔領域検出部１１３は、顔が検出された場合には、その領域の座標をピクセルの矩形の座標値等で算出する。
その上で、顔領域検出部１１３は、算出した座標値を、画像データとともに顔特徴量抽出部１１４へ出力する。 (Step S705)
In step S705, the face area detection unit 113 performs face area coordinate calculation processing.
Specifically, when a face is detected, the face area detection unit 113 calculates the coordinates of the area based on the coordinate value of the pixel rectangle.
After that, the face area detecting unit 113 outputs the calculated coordinate values to the face feature amount extracting unit 114 together with the image data.

（ステップＳ７０６）
ステップＳ７０６において、顔特徴量抽出部１１４は、顔特徴量算出処理を行う。
具体的には、顔特徴量抽出部１１４は、入力された映像と顔領域座標から得られる顔領域画像に対して顔特徴量を算出する。
顔特徴量抽出部１１４は、算出した顔特徴量を、画像データとともに顔特徴量記録部１１５へ出力する。 (Step S706)
In step S706, the face feature amount extraction unit 114 performs face feature amount calculation processing.
Specifically, the face feature quantity extraction unit 114 calculates a face feature quantity for a face area image obtained from the input video and face area coordinates.
The face feature amount extraction unit 114 outputs the calculated face feature amount together with the image data to the face feature amount recording unit 115.

（ステップＳ７０７）
ステップＳ７０７において、顔特徴量記録部１１５は、顔特徴量及び映像ＩＤ記憶処理を行う。
具体的には、顔特徴量記録部１１５は、入力された顔特徴量を、記憶部１１２の特徴量データベース８０１の顔特徴量列８１２に記録する。その際、映像ＩＤも併せて映像ＩＤ列８１１に記録する。
そして、顔特徴量記録部１１５は、画像データを着衣領域検出部１１６へ出力する。 (Step S707)
In step S707, the face feature amount recording unit 115 performs a face feature amount and video ID storage process.
Specifically, the face feature amount recording unit 115 records the input face feature amount in the face feature amount column 812 of the feature amount database 801 of the storage unit 112. At that time, the video ID is also recorded in the video ID column 811.
Then, the face feature amount recording unit 115 outputs the image data to the clothing region detection unit 116.

（ステップＳ７０８）
ステップＳ７０８において、着衣領域検出部１１６は、入力された画像データに対し着衣検出処理を行う。
この着衣検出処理においては、着衣領域検出部１１６は、上述したような各種の着衣の検出アルゴリズムを用いて、着衣の検出値を出力する。その際、予め記憶部１１２に記憶した撮像装置１０２のホワイトバランスや解像度等の特性により、検出パラメータ等を調整した上で用いるのが好適である。 (Step S708)
In step S708, the clothing region detection unit 116 performs clothing detection processing on the input image data.
In this clothing detection process, the clothing region detection unit 116 outputs a clothing detection value using various clothing detection algorithms as described above. At that time, it is preferable to use after adjusting the detection parameters and the like according to the characteristics such as the white balance and resolution of the imaging device 102 stored in advance in the storage unit 112.

（ステップＳ７０９）
ステップＳ７０９において、総合判定部１２１は、着衣が検出されたか判定する。
具体的には、総合判定部１２１は、着衣の検出値が記憶部１１２に記憶された所定の閾値よりも大きい場合はＹｅｓと判定し、それ以外の場合はＮｏと判定する。
Ｙｅｓ、すなわち着衣が検出された場合には、総合判定部１２１は、ステップＳ７１０に処理を進める。
Ｎｏ、すなわち着衣が検出されなかった場合は、総合判定部１２１は、ステップＳ７１３に処理を進める。 (Step S709)
In step S709, the overall determination unit 121 determines whether clothing has been detected.
Specifically, the overall determination unit 121 determines Yes when the detected value of clothing is greater than a predetermined threshold stored in the storage unit 112, and determines No otherwise.
If Yes, that is, if clothing is detected, the overall determination unit 121 proceeds to step S710.
If No, that is, if clothes are not detected, the overall determination unit 121 proceeds to step S713.

（ステップＳ７１０）
ステップＳ７１０において、着衣領域検出部１１６は、着衣領域座標算出処理を行う。
具体的には、着衣領域検出部１１６は、入力された画像データから着衣の領域の座標値を、映像のピクセルの矩形座標や特定ブロック等の形状やベクトル値や曲線の座標等で算出する。
着衣領域検出部１１６は、算出した座標値を画像データとともに、着衣特徴量抽出部１１７へ出力する。 (Step S710)
In step S710, the clothing region detection unit 116 performs clothing region coordinate calculation processing.
Specifically, the clothing area detection unit 116 calculates the coordinate value of the clothing area from the input image data using the rectangular coordinates of the video pixels, the shape of a specific block, the vector value, the coordinates of the curve, and the like.
The clothing region detection unit 116 outputs the calculated coordinate values to the clothing feature value extraction unit 117 together with the image data.

（ステップＳ７１１）
ステップＳ７１１において、着衣特徴量抽出部１１７は、着衣特徴量算出処理を行う。
具体的には、着衣特徴量抽出部１１７は、上述の各種のアルゴリズムにて、着衣の領域画像データに対して着衣特徴量を算出する。
また、着衣特徴量抽出部１１７は、算出した着衣特徴量を、画像データとともに着衣特徴量記録部１１８へ出力する。 (Step S711)
In step S711, the clothing feature value extraction unit 117 performs a clothing feature value calculation process.
Specifically, the clothing feature value extraction unit 117 calculates a clothing feature value for the region image data of the clothing using the various algorithms described above.
The clothing feature value extraction unit 117 outputs the calculated clothing feature value to the clothing feature value recording unit 118 together with the image data.

（ステップＳ７１２）
ステップＳ７１２において、着衣特徴量記録部１１８は、着衣特徴量記憶処理を行う。
具体的には、着衣特徴量記録部１１８は、入力された着衣特徴量を、記憶部１１２の特徴量データベース８０１の着衣特徴量列８１３のうち、映像ＩＤの一致する行のセルに記憶する。
この処理を行った後で、着衣特徴量記録部１１８は、処理をステップＳ７１５に進める。 (Step S712)
In step S712, the clothing feature value recording unit 118 performs a clothing feature value storage process.
Specifically, the clothing feature value recording unit 118 stores the input clothing feature value in the cell of the row with the matching video ID in the clothing feature value column 813 of the feature value database 801 of the storage unit 112.
After performing this process, clothing feature value recording unit 118 advances the process to step S715.

（ステップＳ７１３）
ステップＳ７１３において、着衣領域検出部１１６は、着衣検出なし情報出力処理を行う。
具体的には、着衣領域検出部１１６は、着衣が検出されなかった場合には、「検出なし」の情報（着衣検出なし情報）を着衣特徴量記録部１１８へ出力する。 (Step S713)
In step S713, the clothing region detection unit 116 performs a process of outputting no-clothing-detected information.
Specifically, when no clothing is detected, the clothing region detection unit 116 outputs “not detected” information (no-clothing-detected information) to the clothing feature value recording unit 118.

（ステップＳ７１４）
ステップＳ７１４において、着衣特徴量記録部１１８は、着衣検出なし情報記憶処理を行う。
具体的には、着衣特徴量記録部１１８は、着衣領域検出部１１６から「検出なし」の情報を入力された場合には、着衣未検出として特徴量データベース８０１の着衣特徴量列８１３のうち、映像ＩＤの一致する行のセルに記録する。 (Step S714)
In step S714, the clothing feature value recording unit 118 performs a process of storing the no-clothing-detected information.
Specifically, when “not detected” information is input from the clothing region detection unit 116, the clothing feature value recording unit 118 records the clothing as undetected in the cell of the row with the matching video ID in the clothing feature value column 813 of the feature amount database 801.

（ステップＳ７１５）
ステップＳ７１５において、撮影時刻記録部１１９は、撮影時刻記憶処理を行う。
具体的には、撮影時刻記録部１１９は、記憶部１１２の特徴量データベース８０１の撮影時刻列８１４に、映像の撮影時刻を記憶する。
この撮影時刻としては、システム中で統一された時系列に基づく時刻であればよい。
すなわち、撮像装置１０２にて映像生成した段階での時刻を使ってもよいし、録画装置で映像受信した段階での時刻を使ってもよいし、ステップＳ７１５に到達した段階での時刻を使ってもよい。
また、上述のように、ＧＰＳ等を用いて正確な時刻を記憶することもできる。 (Step S715)
In step S715, the shooting time recording unit 119 performs shooting time storage processing.
Specifically, the shooting time recording unit 119 stores the shooting time of the video in the shooting time column 814 of the feature amount database 801 of the storage unit 112.
The photographing time may be a time based on a time series unified in the system.
That is, the time at which the video was generated by the imaging apparatus 102, the time at which the video was received by the recording apparatus, or the time at which processing reached step S715 may be used.
Further, as described above, accurate time can be stored using GPS or the like.

（ステップＳ７１６）
ステップＳ７１６において、撮影位置記録部１２０は、撮影位置記憶処理を行う。
具体的には、撮影位置記録部１２０は、映像の撮影位置を記憶部１１２の特徴量データベース８０１の撮影位置列８１５に記録する。
この撮影位置記録部１２０で使用する撮影位置は、撮像装置１０２間の相対的な距離の度合がわかるような情報であれば、どのような情報も用いることができる。たとえば、地表平面上のＸ座標、Ｙ座標、Ｚ座標（高度）等を記録しておくことができる。
また、この映像の撮影位置も、上述のようにＧＰＳ等を用いた正確な位置を記憶することができる。
さらに、カメラの相対的な位置を示す符号（又は番号）のようなものを撮影位置の代わりに記録しておいて、実際のカメラの位置は、実際にカメラの物理的な位置を示す情報から算出するようにしてもよい。 (Step S716)
In step S716, the shooting position recording unit 120 performs shooting position storage processing.
Specifically, the shooting position recording unit 120 records the shooting position of the video in the shooting position column 815 of the feature amount database 801 of the storage unit 112.
Any information can be used as the photographing position used in the photographing position recording unit 120 as long as the information indicates the degree of relative distance between the image capturing apparatuses 102. For example, the X coordinate, Y coordinate, Z coordinate (altitude), etc. on the ground plane can be recorded.
In addition, as described above, an accurate position using the GPS or the like can be stored as the shooting position of the video.
Further, a code (or number) indicating the relative position of the camera may be recorded in place of the shooting position, and the actual camera position may then be calculated from information indicating the physical position of the camera.
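As a sketch of the last variation, a camera number stored in the shooting position column can be resolved to coordinates through a separate table, from which the distance between two cameras follows. The table contents, the coordinate units, and the function name below are hypothetical.

```python
import math
from typing import Dict, Tuple

Position = Tuple[float, float, float]  # (X, Y, Z) with Z as altitude

# Hypothetical installation table: camera number -> physical position.
CAMERA_POSITIONS: Dict[int, Position] = {
    1: (0.0, 0.0, 0.0),
    2: (30.0, 40.0, 0.0),
}


def camera_distance(camera_a: int, camera_b: int,
                    positions: Dict[int, Position] = CAMERA_POSITIONS) -> float:
    """Straight-line distance between two cameras identified by installation number."""
    return math.dist(positions[camera_a], positions[camera_b])
```

A distance obtained this way could then be compared with a limit to decide whether two shooting positions count as nearby.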

（ステップＳ７１７）
ステップＳ７１７において、総合判定部１２１は、受信待ち状態処理を行う。
ここで、総合判定部１２１は、次の映像の受信待ち状態となる。具体的には、総合判定部１２１は、記憶画像データをネットワーク部１１１が受信していない場合は、受信されるまで待機する。
また、総合判定部１２１は、監視端末１０３より録画の終了処理のコマンドを受信したり、図示しない入力部の電源ボタンが押下されたことを検知した場合は、すべての処理を終了して電源をシャットダウンする。 (Step S717)
In step S717, the comprehensive determination unit 121 performs reception waiting state processing.
Here, the comprehensive determination unit 121 enters a state of waiting for reception of the next video. Specifically, when the network unit 111 has not received the stored image data, the comprehensive determination unit 121 stands by until it is received.
Also, when the comprehensive determination unit 121 receives a command to end recording from the monitoring terminal 103, or detects that the power button of an input unit (not shown) has been pressed, it ends all processing and shuts down the power.

なお、ステップＳ７０６、Ｓ７０７において、着衣領域検出部１１６は、顔特徴量記録部１１５からの顔領域座標も入力するようにすることができる。
この場合は、「顔の下に着衣があることが多い」という条件を使って着衣領域検出の精度を向上させることができるという効果が得られる。 In steps S 706 and S 707, the clothing area detection unit 116 can also input the face area coordinates from the face feature amount recording unit 115.
In this case, it is possible to improve the accuracy of detection of the clothing area using the condition that “there is often clothing under the face”.
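The condition that clothing is often found below the face can be used, for instance, to restrict where the clothing detector looks. The sketch below derives a candidate region from a face rectangle; the scale factors and the function name are assumptions for illustration only.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def clothing_search_region(face: Rect, image_width: int, image_height: int,
                           width_scale: float = 3.0, height_scale: float = 4.0) -> Rect:
    """Return a rectangle directly below the face, clipped to the image bounds."""
    left, top, right, bottom = face
    face_width = right - left
    face_height = bottom - top
    center_x = (left + right) / 2
    half_width = face_width * width_scale / 2
    region_left = max(0, int(center_x - half_width))
    region_right = min(image_width, int(center_x + half_width))
    region_top = min(image_height, bottom)
    region_bottom = min(image_height, bottom + int(face_height * height_scale))
    return (region_left, region_top, region_right, region_bottom)
```

An empty region (top equal to bottom) simply means the face sits at the lower edge of the frame and no clothing can be visible below it.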

また、本発明の実施の形態に係る人物検索方法では、ステップＳ７０３において検出される顔数が１個である場合を示した。
しかし、同一映像に複数の人物が映っていて、複数個の顔が検出された場合には、ステップＳ７０４〜Ｓ７１６を、検出数の回数分、繰り返すようにすればよい。 In the person search method according to the embodiment of the present invention, the case where the number of faces detected in step S703 is one is shown.
However, when a plurality of persons are shown in the same video and a plurality of faces are detected, steps S704 to S716 may be repeated as many times as the number of detections.
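The overall branching of steps S703 to S716, including the repetition for multiple faces, can be summarized in the following sketch. The detectors are passed in as callables because the description relies on known techniques for them; `FeatureRow`, `record_frame`, the thresholds, and the detector signatures are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Detection = Tuple[float, List[float]]  # (detection value, feature vector)


@dataclass
class FeatureRow:
    video_id: int
    face_feature: List[float]
    clothing_feature: Optional[List[float]]  # None stands for "not detected"
    shooting_time: float
    shooting_position: str


def record_frame(video_id: int, image: object, shooting_time: float, shooting_position: str,
                 detect_faces: Callable[[object], Sequence[Detection]],
                 detect_clothing: Callable[[object], Detection],
                 face_threshold: float = 0.5,
                 clothing_threshold: float = 0.5) -> List[FeatureRow]:
    """Build one database row per face whose detection value exceeds the threshold."""
    rows: List[FeatureRow] = []
    for face_score, face_feature in detect_faces(image):        # S703
        if face_score <= face_threshold:                        # S704: No
            continue
        clothing_score, clothing_feature = detect_clothing(image)  # S708
        stored_clothing: Optional[List[float]] = clothing_feature
        if clothing_score <= clothing_threshold:                # S709: No -> S713, S714
            stored_clothing = None
        rows.append(FeatureRow(video_id, face_feature, stored_clothing,
                               shooting_time, shooting_position))  # S707, S712, S715, S716
    return rows
```

Returning an empty list corresponds to proceeding directly to the reception waiting state of step S717.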

〔ユーザインタフェイス〕
次に、図４を参照して、本発明の実施の形態に係る人物検索方法をユーザが実行するためのユーザインタフェイス画面について説明する。
本発明の実施の形態に係る録画装置１０１は、上述のように、映像ＩＤ、顔特徴量、着衣特徴量、撮影時刻情報、及び撮影位置情報を記憶している。
この記憶された情報から、監視端末１０３のユーザインタフェイスを用いて、監視端末１０３の制御部がユーザの指示を検知し、この指示をコマンドとしてネットワーク１００を介して録画装置１０１に送信する。このコマンドを受信した録画装置１０１では、人物検索方法を実行することができる。
図４は、監視端末１０３の表示部に表示されるユーザインタフェイスの画面である。 [User interface]
Next, with reference to FIG. 4, a user interface screen for the user to execute the person search method according to the embodiment of the present invention will be described.
As described above, the recording apparatus 101 according to the embodiment of the present invention stores the video ID, the face feature value, the clothing feature value, the shooting time information, and the shooting position information.
From the stored information, the control unit of the monitoring terminal 103 detects a user instruction using the user interface of the monitoring terminal 103, and transmits this instruction to the recording apparatus 101 via the network 100 as a command. The recording apparatus 101 that has received this command can execute the person search method.
FIG. 4 is a user interface screen displayed on the display unit of the monitoring terminal 103.

このユーザインタフェイスにおいては、上述の従来の監視端末２０３と同様に、監視端末１０３の入力部がユーザの入力を検知して、表示部への表示と各種処理を行う。
ここで、符号が同一の領域は、図１０の従来の人物検索機能のユーザインタフェイスと同等の機能を実現するための表示領域である。
本発明の実施の形態に係るユーザインタフェイスにおいては、検索操作部５０５が用いられている点が、従来のユーザインタフェイスと異なっている。
以下で、この検索操作部５０５について、更に詳しく説明する。 In this user interface, like the above-described conventional monitoring terminal 203, the input unit of the monitoring terminal 103 detects a user input and performs display on the display unit and various processes.
Here, areas with the same reference numerals are display areas for realizing functions equivalent to those of the user interface of the conventional person search function in FIG. 10.
The user interface according to the embodiment of the present invention is different from the conventional user interface in that the search operation unit 505 is used.
The search operation unit 505 will be described in more detail below.

検索操作部５０５は、ユーザが検索操作部５０５上の検索ボタンを押下することで、検索入力画像指定部３０４に表示された画像にて検索を行うことを指示するための領域である。この検索操作部５０５には、他に、日時やカメラ等の検索の範囲を指定する手段を備えていてもよい（図示しない）。
重み種類選択ボタン５１１〜５１３は、同一人物の判断に使用する重みの種類の選択を与えるためのボタンである。
重みの種類は、それぞれ、撮影時刻情報と着衣情報の関係に基づく重み、撮影時刻情報と撮影位置情報の関係に基づく重み、撮影位置情報と着衣情報の関係に基づく重みの３種類の重みを用いることができる。
重みの種類の選択は複数選択が可能になっており、検索ボタン押下時の選択状況が、以下で説明する検索処理に反映されるようになっている。
重み設定ボタン５１４は、ボタン押下により重み設定画面が起動する。重み設定画面については後述する。 The search operation unit 505 is an area for instructing to perform a search using an image displayed on the search input image specifying unit 304 when the user presses a search button on the search operation unit 505. In addition, the search operation unit 505 may be provided with means for specifying a search range such as date and time or a camera (not shown).
The weight type selection buttons 511 to 513 are buttons for selecting the types of weight used to determine whether two persons are the same.
Three types of weights can be used, respectively: a weight based on the relationship between shooting time information and clothing information, a weight based on the relationship between shooting time information and shooting position information, and a weight based on the relationship between shooting position information and clothing information.
A plurality of types of weights can be selected, and the selection status when the search button is pressed is reflected in the search process described below.
The weight setting button 514 activates a weight setting screen when the button is pressed. The weight setting screen will be described later.

次に図５を参照して、本発明の実施の形態に係る人物検索方法を実行するためのユーザインタフェイスにおける、重み設定画面の一例を示す。図５は、図４に示した重み設定ボタン５１４の押下を検出して、監視端末１０３の制御部が表示部に表示する画面である。 Next, referring to FIG. 5, an example of a weight setting screen in the user interface for executing the person search method according to the embodiment of the present invention is shown. FIG. 5 is a screen that is displayed on the display unit by the control unit of the monitoring terminal 103 when the pressing of the weight setting button 514 shown in FIG. 4 is detected.

時刻＋着衣重み設定表６１０は、撮影時刻情報と着衣情報の２者の関係に基づく重みの設定を与える領域である。
設定値入力欄６１１〜６１４は、重みの設定値を入力する欄である。
重みの設定値は、後述する重みの算出処理で用いる重みの値である。
各重みの設定値としては、例えば、０．０〜２．０までの割合の値を設定可能である。０．０に近い値になると、算出される値が小さくなる。すなわち、０．０で重み付けをしなくなる。
逆に、２．０に近い値になると、算出される値が大きくなる。
このような重みの設定値により調整されて算出された値を適用し、上述の近傍性、すなわち「その人物が同一であるかどうかの同一性」の判断を行う。 The time + clothing weight setting table 610 is an area for giving a weight setting based on the relationship between the shooting time information and the clothing information.
Setting value input fields 611 to 614 are fields for inputting weight setting values.
The weight set value is a weight value used in a weight calculation process described later.
As the set value of each weight, for example, a ratio value from 0.0 to 2.0 can be set. When the value is close to 0.0, the calculated value becomes small. That is, no weighting is performed at 0.0.
Conversely, when the value is close to 2.0, the calculated value increases.
The value calculated after adjustment by these weight setting values is used to judge the above-described proximity, that is, whether or not the persons are the same.

設定値入力欄６１１は、入力画像の撮影時刻と対象画像の撮影時刻が近傍であり、かつ、入力画像の人物の着衣と対象画像の人物の着衣が同一である場合の重みの設定値を入力する欄である。
設定値入力欄６１２は、入力画像の撮影時刻と対象画像の撮影時刻が非近傍であり、かつ、入力画像の人物の着衣と対象画像の人物の着衣が同一である場合の重みの設定値を入力する欄である。
設定値入力欄６１３は、入力画像の撮影時刻と対象画像の撮影時刻が近傍であり、かつ、入力画像の人物の着衣と対象画像の人物の着衣が非同一である場合の重みの設定値を入力する欄である。
設定値入力欄６１４は、入力画像の撮影時刻と対象画像の撮影時刻が非近傍であり、かつ、入力画像の人物の着衣と対象画像の人物の着衣が非同一である場合の重みの設定値を、入力する欄である。
以下、この時刻＋着衣重み設定表６１０に入力された重みの設定値群を、時刻＋着衣重み設定値とする。 The set value input field 611 is a field for inputting the weight setting value used when the shooting time of the input image and the shooting time of the target image are close, and the clothing of the person in the input image and the clothing of the person in the target image are the same.
The set value input field 612 is a field for inputting the weight setting value used when the shooting time of the input image and the shooting time of the target image are not close, and the clothing of the person in the input image and the clothing of the person in the target image are the same.
The set value input field 613 is a field for inputting the weight setting value used when the shooting time of the input image and the shooting time of the target image are close, and the clothing of the person in the input image and the clothing of the person in the target image are not the same.
The set value input field 614 is a field for inputting the weight setting value used when the shooting time of the input image and the shooting time of the target image are not close, and the clothing of the person in the input image and the clothing of the person in the target image are not the same.
Hereinafter, the set value group of weights input to the time + clothing weight setting table 610 is referred to as time + clothing weight setting value.

同様に、時刻＋位置重み設定表６２０は、撮影時刻情報と撮影位置情報の２者の関係に基づく重みの設定を与える領域である。
設定値入力欄６２１〜６２４は重みの設定値を入力する欄である。この重みの設定値についても、上述の設定値入力欄６１１〜６１４と同様に０．０〜２．０までの値を設定可能である。
設定値入力欄６２１は、入力画像の撮影時刻と対象画像の撮影時刻が近傍であり、かつ、入力画像の撮影位置と対象画像の撮影位置が近傍である場合の重みの設定値を、入力する欄である。
設定値入力欄６２２は、入力画像の撮影時刻と対象画像の撮影時刻が非近傍であり、かつ、入力画像の撮影位置と対象画像の撮影位置が近傍である場合の重みの設定値を、入力する欄である。
設定値入力欄６２３は、入力画像の撮影時刻と対象画像の撮影時刻が近傍であり、かつ、入力画像の撮影位置と対象画像の撮影位置が非近傍である場合の重みの設定値を、入力する欄である。
設定値入力欄６２４は、入力画像の撮影時刻と対象画像の撮影時刻が非近傍であり、かつ、入力画像の撮影位置と対象画像の撮影位置が非近傍である場合の重みの設定値を、入力する欄である。
以下、この時刻＋位置重み設定表６２０に入力された重みの設定値群を、時刻＋位置重み設定値とする。 Similarly, the time + position weight setting table 620 is an area for giving a weight setting based on the relationship between the shooting time information and the shooting position information.
The set value input fields 621 to 624 are fields for inputting weight setting values. As with the set value input fields 611 to 614 described above, values from 0.0 to 2.0 can be set.
The set value input field 621 is a field for inputting the weight setting value used when the shooting time of the input image and the shooting time of the target image are close, and the shooting position of the input image and the shooting position of the target image are close.
The set value input field 622 is a field for inputting the weight setting value used when the shooting time of the input image and the shooting time of the target image are not close, and the shooting position of the input image and the shooting position of the target image are close.
The set value input field 623 is a field for inputting the weight setting value used when the shooting time of the input image and the shooting time of the target image are close, and the shooting position of the input image and the shooting position of the target image are not close.
The set value input field 624 is a field for inputting the weight setting value used when the shooting time of the input image and the shooting time of the target image are not close, and the shooting position of the input image and the shooting position of the target image are not close.
Hereinafter, the weight set value group input to the time + position weight setting table 620 is referred to as time + position weight set value.

同様に、位置＋着衣重み設定表６３０は、撮影位置情報と着衣情報の２者の関係に基づく重みの設定値を入力する領域である。
この設定値についても、上述の設定値入力欄６１１〜６１４や設定値入力欄６２１〜６２４と同様に０．０〜２．０までの値を設定可能である。
設定値入力欄６３１〜６３４のうち、６３１は入力画像の撮影位置と対象画像の撮影位置が近傍であり、かつ、入力画像の人物の着衣と対象画像の人物の着衣が同一である場合の重みの設定値を、入力する欄である。
設定値入力欄６３２は、入力画像の撮影位置と対象画像の撮影位置が非近傍であり、かつ、入力画像の人物の着衣と対象画像の人物の着衣が同一である場合の重みの設定値を、入力する欄である。
設定値入力欄６３３は、入力画像の撮影位置と対象画像の撮影位置が近傍であり、かつ、入力画像の人物の着衣と対象画像の人物の着衣が非同一である場合の重みの設定値を、入力する欄である。
設定値入力欄６３４は、入力画像の撮影位置と対象画像の撮影位置が非近傍であり、かつ、入力画像の人物の着衣と対象画像の人物の着衣が非同一である場合の重みの設定値を、入力する欄である。
以下、この位置＋着衣重み設定表６３０に入力された重みの設定値群を、位置＋着衣重み設定値とする。 Similarly, the position + clothing weight setting table 630 is an area for inputting a weight setting value based on the relationship between the shooting position information and the clothing information.
As for the set values, values from 0.0 to 2.0 can be set in the same manner as the set value input fields 611 to 614 and the set value input fields 621 to 624 described above.
Among the set value input fields 631 to 634, field 631 is a field for inputting the weight setting value used when the shooting position of the input image and the shooting position of the target image are close, and the clothing of the person in the input image and the clothing of the person in the target image are the same.
The set value input field 632 is a field for inputting the weight setting value used when the shooting position of the input image and the shooting position of the target image are not close, and the clothing of the person in the input image and the clothing of the person in the target image are the same.
The set value input field 633 is a field for inputting the weight setting value used when the shooting position of the input image and the shooting position of the target image are close, and the clothing of the person in the input image and the clothing of the person in the target image are not the same.
The set value input field 634 is a field for inputting the weight setting value used when the shooting position of the input image and the shooting position of the target image are not close, and the clothing of the person in the input image and the clothing of the person in the target image are not the same.
Hereinafter, the weight set value group input to the position + clothing weight setting table 630 is referred to as a position + clothing weight setting value.
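
Each of the three setting tables can be pictured as a small lookup table keyed by two yes/no conditions. The following Python sketch is an illustration only: the class name `WeightTable`, its field names, and the sample values are assumptions that do not appear in the description, which specifies only the four input fields per table and the 0.0 to 2.0 range.

```python
from dataclasses import dataclass

MIN_WEIGHT, MAX_WEIGHT = 0.0, 2.0  # range stated in the description


@dataclass(frozen=True)
class WeightTable:
    """One 2x2 setting table, e.g. table 610 (time + clothing).

    The field order mirrors input fields x1 to x4 of each table:
    x1 = both conditions hold, x2 = only the second holds,
    x3 = only the first holds, x4 = neither holds.
    """
    both: float         # e.g. field 611: time close, clothing same
    second_only: float  # e.g. field 612: time not close, clothing same
    first_only: float   # e.g. field 613: time close, clothing not same
    neither: float      # e.g. field 614: time not close, clothing not same

    def __post_init__(self):
        # Reject values outside the range the setting screen allows.
        for value in (self.both, self.second_only, self.first_only, self.neither):
            if not MIN_WEIGHT <= value <= MAX_WEIGHT:
                raise ValueError(f"weight {value} is outside 0.0 to 2.0")

    def lookup(self, first: bool, second: bool) -> float:
        """Return the setting value for the given pair of conditions."""
        if first and second:
            return self.both
        if second:
            return self.second_only
        if first:
            return self.first_only
        return self.neither


# Hypothetical sample values, chosen only for illustration.
time_clothing = WeightTable(both=1.8, second_only=1.0, first_only=0.6, neither=0.2)
```

For table 620 the two conditions would be "time close" and "position close", and for table 630 "position close" and "clothing same", so the same structure covers all three tables.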

適用ボタン６４１は、ユーザが押下することで、各設定表に入力した内容を反映するためのボタンである。
閉じるボタン６４２は、ユーザが押下することで、この設定画面を終了して操作画面に処理を戻すためのボタンである。
監視端末１０３の制御部は、上述の各ボタンをユーザが押下したことを検知して、それぞれの処理を行う。 The apply button 641 is a button for reflecting the contents input to each setting table when pressed by the user.
The close button 642 is a button for ending this setting screen and returning the processing to the operation screen when pressed by the user.
The control unit of the monitoring terminal 103 detects that the user has pressed each button described above, and performs each process.

なお、時刻＋着衣重み設定表６１０、時刻＋位置重み設定表６２０、位置＋着衣重み設定表６３０の各重み設定表においては、撮影時刻情報と撮影位置情報に対して、近傍と非近傍の２区分の入力欄があるように示した。しかし、この区分数を増やしてより細かく設定できるようにしてもよい。
また、本例では各区分を分ける際の閾値を所定の値にしているように説明した。しかし、この区分を分ける閾値に関しても、ユーザが自由に設定できるようにしてもよい。 Note that each of the weight setting tables, namely the time + clothing weight setting table 610, the time + position weight setting table 620, and the position + clothing weight setting table 630, has been shown with input fields for two divisions, near and not near, for the shooting time information and the shooting position information. However, the number of divisions may be increased so that the weights can be set more finely.
Further, in this example, the description has been made such that the threshold value for dividing each section is set to a predetermined value. However, the user may be able to freely set the threshold for dividing this division.
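
The variation with more divisions and user-adjustable thresholds can be sketched as follows. This is a minimal illustration under assumptions: the function name `classify` and the threshold values in seconds are hypothetical, and the description does not state how a value exactly equal to a threshold is treated (here it falls into the farther division).

```python
from bisect import bisect_right
from typing import Sequence


def classify(difference: float, thresholds: Sequence[float]) -> int:
    """Return the index of the division that `difference` falls into.

    `thresholds` must be sorted in ascending order. One threshold gives
    two divisions (0 = near, 1 = not near); each extra threshold adds
    one more division.
    """
    return bisect_right(thresholds, difference)


# Two divisions with an assumed 600-second threshold.
two_way = classify(120.0, [600.0])

# Four divisions with assumed thresholds of 1 minute, 10 minutes, and 1 hour.
four_way = classify(1000.0, [60.0, 600.0, 3600.0])
```

The returned index could then select a row or column of a larger weight setting table in place of the two-way near / not near choice.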

〔人物検索処理〕
次に、図６のタイミングチャートを参照して、本発明の実施の形態に係る人物検索方法を適用した録画装置１０１と監視端末１０３とにおける、人物検索処理のネットワーク１００を介した情報の送受信の流れについて説明する。以下の情報の送受信の処理は、録画装置１０１又は監視端末１０３の制御部等と各部が協調して実行される。
監視端末１０３は本処理前に、録画装置１０１からの画像データを入力して、図４に示した表示処理を完了しているものとする。すなわち、検索入力画像指定部３０４には検索入力画像が表示されている状態である。この際に、監視端末１０３は、検索入力画像とこれに係るフレーム番号である映像ＩＤも受信している。
ここで、監視端末１０３は、上述のユーザインタフェイスにおいて、ユーザが検索操作部５０５の検索ボタンを押下したことを検知して、以下の人物検索処理を始める。 [Person search processing]
Next, with reference to the timing chart of FIG. 6, the flow of information transmitted and received via the network 100 during person search processing between the recording apparatus 101 and the monitoring terminal 103, to which the person search method according to the embodiment of the present invention is applied, will be described. The following transmission and reception processing is executed by the control unit or the like of the recording apparatus 101 or the monitoring terminal 103 in cooperation with the other units.
It is assumed that, before this processing, the monitoring terminal 103 has received the image data from the recording apparatus 101 and has completed the display processing shown in FIG. 4. That is, a search input image is displayed in the search input image specifying unit 304. At this point, the monitoring terminal 103 has also received, together with the search input image, the video ID, which is the frame number associated with it.
Here, the monitoring terminal 103 detects that the user has pressed the search button of the search operation unit 505 in the above-described user interface, and starts the following person search process.

（タイミングＴ１０１）
タイミングＴ１０１において、監視端末１０３は、検索入力画像送信処理を行う。すなわち、監視端末１０３は、上述の送信ボタンの押下を検知した場合は、検索入力画像指定部３０４に表示された検索入力画像の映像ＩＤを、録画装置１０１に送信する。さらに、チェックボックスの状態等の検索用の付随情報に関しても、録画装置１０１に送信する。
録画装置１０１は、この検索入力画像から、検索前処理として、ステップＳ２０１において、顔画像を検出する。 (Timing T101)
At timing T101, the monitoring terminal 103 performs search input image transmission processing. That is, when the monitoring terminal 103 detects that the transmission button is pressed, the monitoring terminal 103 transmits the video ID of the search input image displayed on the search input image specifying unit 304 to the recording apparatus 101. Further, associated information for search such as the state of a check box is also transmitted to the recording apparatus 101.
In step S201, the recording apparatus 101 detects a face image from the search input image as pre-search processing.

（タイミングＴ１０２）
次に、タイミングＴ１０２において、録画装置１０１は、顔検出なし情報送信処理を行う。
ここでは、録画装置１０１は、検索前処理で顔画像を検出したかを、ステップＳ２０２において判定する。
Ｙｅｓ、すなわち顔画像が検出された場合には、この顔検出なし情報送信処理は行わず、実際の顔特徴量の検索と服装特徴量の検索の処理を行う。
Ｎｏ、すなわち顔画像が検出されなかった場合には、「検出なし」の情報である顔検出なし情報を、監視端末１０３に送信する。 (Timing T102)
Next, at timing T102, the recording apparatus 101 performs no-face-detection information transmission processing.
Here, the recording apparatus 101 determines in step S202 whether a face image was detected in the search preprocessing.
If Yes, that is, if a face image was detected, the no-face-detection information is not transmitted, and the actual face feature amount search and clothing feature amount search are performed.
If No, that is, if no face image was detected, no-face-detection information, which indicates “no detection”, is transmitted to the monitoring terminal 103.

監視端末１０３は、顔検出なし情報が（顔の）「検出なし」であった場合は、「顔が検出できません」等と表示する。 The monitoring terminal 103 displays “no face can be detected” or the like when the no face detection information is “no detection” (for the face).

録画装置１０１は、顔を検出した場合（ステップＳ２０２のＹｅｓ）は、検索入力画像と他の検索操作部５０５の設定に従って、顔特徴量の検索と着衣特徴量の検索の処理を行う（ステップＳ２０３）。
さらに、録画装置１０１は、重みの設定値や算出された重みの値を用いて実際の検索処理を行い、検索対象画像から検索して見つかった画像である検索出力画像を記憶部１１２に記憶して蓄積する（ステップＳ２０４）。 When the recording apparatus 101 detects a face (Yes in step S202), it searches for face feature amounts and clothing feature amounts according to the search input image and the other settings of the search operation unit 505 (step S203).
Further, the recording apparatus 101 performs the actual search processing using the weight setting values and the calculated weight values, and stores and accumulates in the storage unit 112 the search output images, which are the images found by searching the search target images (step S204).

（タイミングＴ１０３）
次に、タイミングＴ１０３において、録画装置１０１は、検索出力画像一覧送信処理を行う。
ここでは、録画装置１０１は、出力する検索出力画像の一覧である検索出力画像一覧を監視端末１０３に送信する。 (Timing T103)
Next, at timing T103, the recording apparatus 101 performs search output image list transmission processing.
Here, the recording apparatus 101 transmits a search output image list that is a list of search output images to be output to the monitoring terminal 103.

監視端末１０３は、この検索出力画像を、表示部に表示されたユーザインタフェイスの検索出力画像一覧表示部３０６に一覧表示する。
これにより、人物検索処理を終了する。 The monitoring terminal 103 displays a list of the search output images on the search output image list display unit 306 of the user interface displayed on the display unit.
Thereby, the person search process is completed.
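
The exchange at timings T101 to T103 can be sketched from the recording apparatus side as follows. This is a hedged illustration, not the implementation in the description: the message classes, the function `handle_search_request`, and the two callables standing in for face detection and the weighted search are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Union


@dataclass(frozen=True)
class SearchRequest:
    """Sent by the monitoring terminal at timing T101."""
    video_id: int
    selected_weight_types: FrozenSet[int]  # subset of {511, 512, 513}


@dataclass(frozen=True)
class NoFaceDetected:
    """Sent by the recording apparatus at timing T102."""


@dataclass(frozen=True)
class SearchOutputList:
    """Sent by the recording apparatus at timing T103."""
    video_ids: List[int]


def handle_search_request(
    request: SearchRequest,
    detect_face: Callable[[int], bool],
    run_search: Callable[[SearchRequest], List[int]],
) -> Union[NoFaceDetected, SearchOutputList]:
    """Recorder-side flow: face check (S201, S202), then search (S203, S204)."""
    if not detect_face(request.video_id):
        return NoFaceDetected()                       # reply at T102
    return SearchOutputList(run_search(request))      # reply at T103
```

Only one of the two replies is sent for a given request, which is why the terminal can decide what to display from the type of the reply alone.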

次に、図７と図８Ａ〜Ｄのフローチャートを参照して、録画装置１０１と監視端末１０３における実際の処理について、より具体的に説明する。
まずは、図７を参照して、監視端末１０３の処理について説明する。 Next, the actual processing in the recording apparatus 101 and the monitoring terminal 103 will be described more specifically with reference to the flowcharts of FIG. 7 and FIGS. 8A to 8D.
First, the processing of the monitoring terminal 103 will be described with reference to FIG. 7.

〔監視端末１０３の処理〕
（ステップＳ９０１）
ステップＳ９０１において、監視端末１０３の制御部は、検索ボタン押下検知処理を行う。
具体的には、監視端末１０３の表示部に表示された図４のユーザインタフェイスにおいて、検索操作部５０５の検索ボタンがポインティングデバイス等を用いてユーザに押下されたことを検知する。 [Processing of the monitoring terminal 103]
(Step S901)
In step S901, the control unit of the monitoring terminal 103 performs a search button press detection process.
Specifically, in the user interface of FIG. 4 displayed on the display unit of the monitoring terminal 103, it is detected that the search button of the search operation unit 505 has been pressed by the user using a pointing device or the like.

（ステップＳ９０２）
ステップＳ９０２において、監視端末１０３の制御部は、検索入力画像送信処理を行う。
具体的には、検索入力画像指定部３０４の検索入力画像の画像データに係る映像ＩＤを、ネットワーク１００を介して、録画装置１０１に対して送信する。この送信のタイミングは、Ｔ１０１である。
この画像データの送信により、録画装置１０１に対しては人物検索の実行を要求することとなる。これを、検索実行要求とする。
検索実行要求の際には、更に、検索ボタン押下時の重み種類選択ボタン５１１〜５１３の選択状況を送信する。これらの情報を、検索付随情報とする。
その後、監視端末１０３の制御部は、録画装置１０１からの顔検出なし情報又は検索出力画像一覧を受信するまで待機する。
なお、監視端末１０３の制御部は、顔検出なし情報と検索画像一覧のいずれも、所定時間受信しない場合には、録画装置１０１との接続ができない旨の表示をすることもできる。 (Step S902)
In step S902, the control unit of the monitoring terminal 103 performs a search input image transmission process.
Specifically, the video ID related to the image data of the search input image of the search input image specifying unit 304 is transmitted to the recording apparatus 101 via the network 100. This transmission timing is T101.
By transmitting the image data, the recording apparatus 101 is requested to execute a person search. This is a search execution request.
In the case of a search execution request, the selection status of the weight type selection buttons 511 to 513 when the search button is pressed is further transmitted. These pieces of information are used as search accompanying information.
Thereafter, the control unit of the monitoring terminal 103 waits until it receives either the no-face-detection information or a search output image list from the recording apparatus 101.
Note that, if neither the no-face-detection information nor the search output image list is received within a predetermined time, the control unit of the monitoring terminal 103 can also display a message indicating that it cannot connect to the recording apparatus 101.

（ステップＳ９０３）
ステップＳ９０３において、監視端末１０３の制御部は、録画装置１０１からの情報を受信し、この情報が顔検出なし情報であるか判定する。
具体的には、Ｔ１０２の顔検出なし情報送信のタイミングで録画装置１０１より送信された顔検出なし情報を受信したか、又は（顔検出なし情報ではなく）検索出力画像一覧を受信したかについて判定する。
上述のように、顔検出なし情報は、録画装置１０１にて顔が検出されなかった「検出なし」を示す情報である。また、検索出力画像一覧は、顔が検出されて各種処理を行った後で録画装置１０１が出力する画像データ等である。
Ｙｅｓ、つまり「検出なし」の場合は、監視端末１０３の制御部は、処理をステップＳ９０６に進める。
Ｎｏ、すなわち検索出力画像であった場合は、監視端末１０３の制御部は、処理をステップＳ９０４に進める。 (Step S903)
In step S903, the control unit of the monitoring terminal 103 receives information from the recording apparatus 101 and determines whether this information is the no-face-detection information.
Specifically, it determines whether it has received the no-face-detection information transmitted from the recording apparatus 101 at timing T102, or has instead received a search output image list.
As described above, the no-face-detection information indicates “no detection”, meaning that no face was detected by the recording apparatus 101. The search output image list consists of the image data and the like that the recording apparatus 101 outputs after a face has been detected and the various processes have been performed.
If Yes, that is, in the case of “no detection”, the control unit of the monitoring terminal 103 advances the process to step S906.
If No, that is, if a search output image list was received, the control unit of the monitoring terminal 103 advances the process to step S904.

（ステップＳ９０４）
ステップＳ９０４において、監視端末１０３の制御部は、検索出力画像一覧受信処理を行う。
具体的には、監視端末１０３は、タイミングＴ１０３で録画装置１０１から送信された検索出力画像一覧を、ネットワーク１００を介して受信する。 (Step S904)
In step S904, the control unit of the monitoring terminal 103 performs a search output image list reception process.
Specifically, the monitoring terminal 103 receives the search output image list transmitted from the recording apparatus 101 at the timing T103 via the network 100.

（ステップＳ９０５）
ステップＳ９０５において、監視端末１０３の制御部は、サムネイル一覧表示処理を行う。
具体的には、受信内容を図４に示した検索出力画像一覧表示部３０６に、サムネイル表示する。
本発明の実施の形態の人物検索方法では、着衣情報や重み付けを用いて条件の絞り込みができるため、従来の人物検索方法よりもこの検索画像を少なく得る（擬陽性、ｆａｌｓｅｐｏｓｉｔｉｖｅが少ない）ことが期待できる。
これにより、ユーザの監視負担を減らすことができるという効果が得られる。 (Step S905)
In step S905, the control unit of the monitoring terminal 103 performs thumbnail list display processing.
Specifically, the received contents are displayed as thumbnails on the search output image list display unit 306 shown in FIG.
In the person search method according to the embodiment of the present invention, the conditions can be narrowed down using clothing information and weighting, so fewer search output images (that is, fewer false positives) can be expected than with the conventional person search method.
This has the effect of reducing the monitoring burden on the user.

（ステップＳ９０６）
ステップＳ９０６において、監視端末１０３の制御部は、顔検出なし情報を受信した情報を受信した場合、「検出なし」表示処理を行う。
具体的には、監視端末１０３の制御部は、検索入力画像には顔が検出できなかった旨を、図４の検索出力画像一覧表示部３０６等にメッセージ表示する。 (Step S906)
In step S906, upon receiving the no-face-detection information, the control unit of the monitoring terminal 103 performs “no detection” display processing.
Specifically, the control unit of the monitoring terminal 103 displays a message on the search output image list display unit 306 in FIG. 4 and the like indicating that no face has been detected in the search input image.

（ステップＳ９０７）
ステップＳ９０７において、監視端末１０３の制御部は、ユーザ操作待ち処理を行う。
具体的には、監視端末１０３の制御部は、表示部のユーザインタフェイスの表示・更新等の処理を行い、検索操作部５０５に関して次のユーザ操作待ち状態となる。
以上で、監視端末１０３の処理を終了する。 (Step S907)
In step S907, the control unit of the monitoring terminal 103 performs a user operation waiting process.
Specifically, the control unit of the monitoring terminal 103 performs processing such as display / update of the user interface of the display unit, and enters the waiting state for the next user operation with respect to the search operation unit 505.
This completes the processing of the monitoring terminal 103.
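
The terminal-side branching in steps S903 to S906, together with the optional timeout message mentioned under step S902, can be sketched as a single dispatch function. The names `NO_FACE` and `terminal_display` and the message strings are assumptions made for illustration; `None` is used here to model the case where nothing arrives within the predetermined time.

```python
from typing import Sequence, Union

NO_FACE = "no_face"  # stand-in marker for the no-face-detection information


def terminal_display(response: Union[str, Sequence[int], None]) -> str:
    """Map the recording apparatus's reply to what the terminal shows."""
    if response is None:
        # Neither reply arrived within the predetermined time.
        return "Cannot connect to the recording apparatus"
    if response == NO_FACE:
        # Step S906: "no detection" display processing.
        return "No face could be detected"
    # Steps S904 and S905: receive the list and show it as thumbnails.
    return f"{len(response)} thumbnail(s)"
```

After any of the three outcomes, the terminal would return to waiting for the next user operation, as in step S907.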

次に、図８Ａ〜図８Ｄを参照して、録画装置１０１の行う人物検索処理についてより詳しく説明する。 Next, the person search process performed by the recording apparatus 101 will be described in more detail with reference to FIGS. 8A to 8D.

〔録画装置１０１の人物検索処理〕
（ステップＳ９１１）
まず、ステップＳ９１１において、録画装置１０１のネットワーク部１１１は、検索入力画像受信処理を行う。
具体的には、タイミングＴ１０１で監視端末１０３から送信された検索入力画像（検索入力画像指定部３０４の画像）の画像データに係る映像ＩＤを、録画装置１０１のネットワーク部１１１で受信する。すなわち、検索実行要求を受信する。
ここで、更に、ネットワーク部１１１は、重み種類選択ボタン５１１〜５１３の選択状況等である検索付随情報についても受信する。
ネットワーク部１１１は、受信された検索入力画像の画像データと付随情報を、総合判定部１２１へ出力する。 [Person Search Processing of Recording Apparatus 101]
(Step S911)
First, in step S911, the network unit 111 of the recording apparatus 101 performs a search input image reception process.
Specifically, the video ID related to the image data of the search input image (image of the search input image specifying unit 304) transmitted from the monitoring terminal 103 at timing T101 is received by the network unit 111 of the recording apparatus 101. That is, a search execution request is received.
Here, the network unit 111 also receives search-accompanying information such as selection statuses of the weight type selection buttons 511 to 513.
The network unit 111 outputs the image data and accompanying information of the received search input image to the comprehensive determination unit 121.

（ステップＳ９１２）
ステップＳ９１２において、総合判定部１２１は、検索付随情報記憶処理を行う。
具体的には、総合判定部１２１は、検索付随情報のうち、重み種類選択ボタン５１１〜５１３の選択状況と検索入力画像の映像ＩＤを総合判定部１２１の一時的な記憶手段に記憶し、更に記憶部１１２へ出力する。 (Step S912)
In step S912, the comprehensive determination unit 121 performs a search associated information storage process.
Specifically, the comprehensive determination unit 121 stores, from the search accompanying information, the selection status of the weight type selection buttons 511 to 513 and the video ID of the search input image in its temporary storage means, and further outputs this data to the storage unit 112.

（ステップＳ９１３）
ステップＳ９１３において、記憶部１１２は、画像データ検索処理を行う。
具体的には、記憶部１１２は、入力された映像ＩＤから対応する画像、すなわち検索入力画像の画像データを記録媒体から読み出し、総合判定部１２１へ出力する。 (Step S913)
In step S913, the storage unit 112 performs image data search processing.
Specifically, the storage unit 112 reads the corresponding image from the input video ID, that is, the image data of the search input image from the recording medium, and outputs it to the comprehensive determination unit 121.

（ステップＳ９１４）
ステップＳ９１４において、総合判定部１２１は、画像データ出力処理を行う。
具体的には、総合判定部１２１は、検索入力画像の画像データを一時的な記憶手段に記憶するとともに、顔領域検出部１１３へ出力する。 (Step S914)
In step S914, the comprehensive determination unit 121 performs image data output processing.
Specifically, the comprehensive determination unit 121 stores the image data of the search input image in a temporary storage unit and outputs it to the face area detection unit 113.

（ステップＳ９１５）
ステップＳ９１５において、顔領域検出部１１３は、顔検出処理を行う。
具体手的には、顔領域検出部１１３は入力された検索入力画像の画像データに対し顔検出処理を行う。 (Step S915)
In step S915, the face area detection unit 113 performs face detection processing.
Specifically, the face area detection unit 113 performs face detection processing on the image data of the input search input image.

（ステップＳ９１６）
ステップＳ９１６において、総合判定部１２１は、顔が検出されたかについて判定する。
ここでは、ステップＳ７０４と同様に閾値を用いて判定する。
Ｙｅｓの場合、すなわち顔が検出された場合には、総合判定部１２１は、処理をステップＳ９１９に進める。
Ｎｏの場合、すなわち顔が検出されなかった場合には、総合判定部１２１は、処理をステップＳ９１７に進める。 (Step S916)
In step S916, the comprehensive determination unit 121 determines whether a face has been detected.
Here, the determination is made using the threshold value as in step S704.
In the case of Yes, that is, when a face is detected, the overall determination unit 121 advances the process to step S919.
In the case of No, that is, when the face is not detected, the comprehensive determination unit 121 advances the process to step S917.

（ステップＳ９１７）
ステップＳ９１７において、総合判定部１２１は、「検出なし」の情報を総合判定部１２１へ出力する。 (Step S917)
In step S917, the comprehensive determination unit 121 outputs “no detection” information to the comprehensive determination unit 121.

（ステップＳ９１８）
ステップＳ９１８において、総合判定部１２１は、「検出なし」情報送信処理を行う。
具体的には、総合判定部１２１は、タイミングＴ１０２において顔検出なし情報をネットワーク部１１１から送信する。
この顔検出なし情報は、上述のように「検出なし」の情報であり、ネットワーク１００を介して、監視端末１０３が受信する。
この後、総合判定部１２１は、処理をステップＳ９６９に進める。 (Step S918)
In step S918, the comprehensive determination unit 121 performs “no detection” information transmission processing.
Specifically, the comprehensive determination unit 121 transmits the no-face-detection information via the network unit 111 at timing T102.
As described above, this no-face-detection information indicates “no detection” and is received by the monitoring terminal 103 via the network 100.
Thereafter, the overall determination unit 121 proceeds with the process to step S969.

（ステップＳ９１９）
ステップＳ９１９において、総合判定部１２１は、顔領域座標算出処理を行う。
すなわち、ステップＳ７０５と同様に、顔の領域座標を算出し、算出した座標値を画像データとともに顔特徴量抽出部１１４へ出力する。 (Step S919)
In step S919, the comprehensive determination unit 121 performs face area coordinate calculation processing.
That is, as in step S705, the face area coordinates are calculated, and the calculated coordinate values are output to the face feature quantity extraction unit 114 together with the image data.

なお、タイミングＴ１０２の顔検出なし情報送信の代わりに「検出あり」情報をネットワーク部１１１から送信してもよい。
これにより、監視端末１０３がこの「検出あり」情報を受信すると、監視端末１０３の制御部は、ユーザインタフェイスの検索出力画像一覧表示部３０６に「検索中」といった表示を描画することができる。よって、ユーザが検索結果を待つ際のいらいら感を減じることが可能になる。 Note that “detected” information may be transmitted from the network unit 111 instead of transmitting information without face detection at timing T102.
Thereby, when the monitoring terminal 103 receives the “with detection” information, the control unit of the monitoring terminal 103 can draw a display such as “searching” on the search output image list display unit 306 of the user interface. Therefore, it is possible to reduce annoyance when the user waits for the search result.

（ステップＳ９２０）
ステップＳ９２０において、顔特徴量抽出部１１４は、顔特徴量算出処理を行う。
すなわち、ステップＳ７０６と同様に、顔特徴量抽出部１１４は入力された画像と顔領域座標から得られる顔領域画像に対して顔特徴量を算出する。そして、算出した顔特徴量を、総合判定部１２１へ出力する。 (Step S920)
In step S920, the face feature amount extraction unit 114 performs a face feature amount calculation process.
That is, as in step S706, the face feature quantity extraction unit 114 calculates a face feature quantity for a face area image obtained from the input image and face area coordinates. Then, the calculated face feature amount is output to the comprehensive determination unit 121.

（ステップＳ９２１）
ステップＳ９２１において、総合判定部１２１は、顔特徴量記憶処理を行う。
具体的には、総合判定部１２１は、入力された顔特徴量を記憶部１１２の記憶媒体に記憶する。 (Step S921)
In step S921, the comprehensive determination unit 121 performs a face feature amount storage process.
Specifically, the comprehensive determination unit 121 stores the input face feature amount in a storage medium of the storage unit 112.

（ステップＳ９２２）
ステップＳ９２２において、総合判定部１２１は、重み種類選択ボタン５１１及び／又は５１３が選択されているか判定する。
具体的には、総合判定部１２１は、ステップＳ９１１で入力された重み種類選択ボタン５１１〜５１３の選択状況において、重み種類選択ボタン５１１及び／又は５１３が選択されているかについて判定する。
Ｙｅｓの場合は、総合判定部１２１は、ステップＳ９２３に処理を進める。
Ｎｏの場合は、総合判定部１２１は、ステップＳ９３１に処理を進める。 (Step S922)
In step S922, the comprehensive determination unit 121 determines whether the weight type selection button 511 and/or 513 is selected.
Specifically, the comprehensive determination unit 121 determines whether the weight type selection button 511 and/or 513 is selected in the selection status of the weight type selection buttons 511 to 513 input in step S911.
If Yes, the comprehensive determination unit 121 advances the process to step S923.
If No, the comprehensive determination unit 121 advances the process to step S931.

（ステップＳ９２３）
ステップＳ９２３において、総合判定部１２１は、検索入力画像の画像データを着衣領域検出部１１６へ出力する。 (Step S923)
In step S923, the comprehensive determination unit 121 outputs the image data of the search input image to the clothing region detection unit 116.

（ステップＳ９２４）
ステップＳ９２４において、着衣領域検出部１１６は、検索入力画像の画像データ対して、着衣検出処理を行う。この着衣検出処理は、ステップＳ７０８と同様に行う。 (Step S924)
In step S924, the clothing region detection unit 116 performs clothing detection processing on the image data of the search input image. This clothing detection process is performed in the same manner as in step S708.

（ステップＳ９２５）
ステップＳ９２５において、総合判定部１２１は、着衣が検出されたかを判定する。この判定は、ステップＳ７０９と同様に行う。
Ｙｅｓの場合は、総合判定部１２１は処理をステップＳ９２６に進める。
Ｎｏの場合は、総合判定部１２１は処理をステップＳ９２９に進める。 (Step S925)
In step S925, the comprehensive determination unit 121 determines whether clothing has been detected. This determination is performed in the same manner as in step S709.
If Yes, the comprehensive determination unit 121 advances the process to step S926.
If No, the comprehensive determination unit 121 advances the process to step S929.

（ステップＳ９２６）
ステップＳ９２６において、着衣領域検出部１１６は、検索入力画像の画像データ対して、着衣領域座標算出処理を行う。
この処理は、ステップＳ７１０と同様に行う。すなわち、着衣の領域座標を算出して、算出した座標値を画像とともに着衣特徴量抽出部１１７へ出力する。 (Step S926)
In step S926, the clothing region detection unit 116 performs a clothing region coordinate calculation process on the image data of the search input image.
This process is performed similarly to step S710. That is, the clothing region coordinates are calculated, and the calculated coordinate values are output to the clothing feature value extraction unit 117 together with the image.

（ステップＳ９２７）
ステップＳ９２７において、着衣特徴量抽出部１１７は、検索入力画像の画像データ対して、着衣特徴量算出処理を行う。
この処理は、ステップＳ７１１と同様に行う。すなわち、着衣特徴量抽出部１１７は、検索入力画像の画像データと着衣の領域座標から得られる着衣領域画像に対して、着衣特徴量を算出し、算出した着衣特徴量を総合判定部１２１へ出力する。
これにより、総合判定部１２１は、着衣を検出したかについての情報である着衣検出情報を「検出あり」として、一次的な記憶手段に記憶する。 (Step S927)
In step S927, the clothing feature value extraction unit 117 performs a clothing feature value calculation process on the image data of the search input image.
This process is performed in the same manner as step S711. That is, the clothing feature amount extraction unit 117 calculates a clothing feature amount for the clothing region image obtained from the image data of the search input image and the clothing region coordinates, and outputs the calculated clothing feature amount to the comprehensive determination unit 121.
The comprehensive determination unit 121 then stores the clothing detection information, which indicates whether clothing was detected, as “detected” in its temporary storage means.

（ステップＳ９２８）
ステップＳ９２８において、総合判定部１２１は、着衣特徴量記憶処理を行う。
具体的には、総合判定部１２１は、入力された着衣特徴量を、一時的な記憶手段に記憶する。
この処理の後、総合判定部１２１は、処理をステップＳ９３１に進める。 (Step S928)
In step S928, the overall determination unit 121 performs a clothing feature amount storage process.
Specifically, the comprehensive determination unit 121 stores the input clothing feature value in a temporary storage unit.
After this process, the overall determination unit 121 advances the process to step S931.

（ステップＳ９２９）
ステップＳ９２９において、着衣領域検出部１１６は、着衣が検出されなかった場合には、「検出なし」の情報を総合判定部１２１へ出力する。 (Step S929)
In step S929, the clothing region detection unit 116 outputs “no detection” information to the overall determination unit 121 when no clothing is detected.

（ステップＳ９３０）
ステップＳ９３０において、総合判定部１２１は、着衣領域検出部１１６から「検出なし」の情報を入力された場合には、着衣情報を着衣未検出として一時的な記憶手段に記憶する。 (Step S930)
In step S930, when the “no detection” information is input from the clothing region detection unit 116, the overall determination unit 121 stores the clothing information in the temporary storage unit as “clothing not detected”.

（ステップＳ９３１）
ステップＳ９３１において、総合判定部１２１は、重み種類選択ボタン５１１及び／又は５１２が選択されているか判定する。
具体的には、ステップＳ９１１で入力された重み種類選択ボタン５１１〜５１３の選択状況において、重み種類選択ボタン５１１及び／又は５１２が選択されているかを判定する。
Ｙｅｓの場合、総合判定部１２１は、処理をステップＳ９３２に進める。
Ｎｏの場合、総合判定部１２１は、処理をステップＳ９３５に進める。 (Step S931)
In step S931, the overall determination unit 121 determines whether the weight type selection button 511 and/or 512 is selected.
Specifically, it determines whether the weight type selection button 511 and/or 512 is selected in the selection status of the weight type selection buttons 511 to 513 input in step S911.
If Yes, the comprehensive determination unit 121 advances the process to step S932.
If No, the comprehensive determination unit 121 advances the process to step S935.

（ステップＳ９３２）
ステップＳ９３２において、総合判定部１２１は、検索入力画像の映像ＩＤを撮影時刻記録部１１９へ出力する。 (Step S932)
In step S932, the comprehensive determination unit 121 outputs the video ID of the search input image to the shooting time recording unit 119.

（ステップＳ９３３）
ステップＳ９３３において、撮影時刻記録部１１９は、撮影時刻出力処理を行う。
具体的には、撮影時刻記録部１１９は、入力された映像ＩＤから対応する時刻、すなわち検索入力画像の撮影時刻を、記憶部１１２の記録媒体から読み出し、総合判定部１２１へ出力する。 (Step S933)
In step S933, the shooting time recording unit 119 performs shooting time output processing.
Specifically, the shooting time recording unit 119 reads the time corresponding to the input video ID, that is, the shooting time of the search input image, from the recording medium of the storage unit 112, and outputs it to the comprehensive determination unit 121.

（ステップＳ９３４）
ステップＳ９３４において、総合判定部１２１は、入力された時刻を一時的な記憶手段に記憶する。 (Step S934)
In step S934, the comprehensive determination unit 121 stores the input time in a temporary storage unit.

（ステップＳ９３５）
ステップＳ９３５において、総合判定部１２１は、重み種類選択ボタン５１２及び／又は５１３が選択されているか判定する。
具体的には、ステップＳ９１１で入力された重み種類選択ボタン５１１〜５１３の選択状況において、重み種類選択ボタン５１２及び／又は５１３が選択されているかを判定する。
Ｙｅｓの場合、総合判定部１２１は、処理をステップＳ９３６に進める。
Ｎｏの場合、総合判定部１２１は、処理をステップＳ９３９に進める。 (Step S935)
In step S935, the overall determination unit 121 determines whether the weight type selection button 512 and / or 513 is selected.
Specifically, it is determined whether the weight type selection buttons 512 and / or 513 are selected in the selection status of the weight type selection buttons 511 to 513 input in step S911.
If Yes, the comprehensive determination unit 121 advances the process to step S936.
If No, the comprehensive determination unit 121 advances the process to step S939.

（ステップＳ９３６）
ステップＳ９３６において、総合判定部１２１は、検索入力画像の映像ＩＤを撮影位置記録部１２０へ出力する。 (Step S936)
In step S936, the comprehensive determination unit 121 outputs the video ID of the search input image to the shooting position recording unit 120.

（ステップＳ９３７）
ステップＳ９３７において、撮影位置出力処理を行う。
具体的には、撮影位置記録部１２０は、入力された映像ＩＤから対応する位置、すなわち検索入力画像の撮影位置を、記憶部１１２の記録媒体から読み出し、総合判定部１２１へ出力する。 (Step S937)
In step S937, the shooting position recording unit 120 performs shooting position output processing.
Specifically, the shooting position recording unit 120 reads the position corresponding to the input video ID, that is, the shooting position of the search input image, from the recording medium of the storage unit 112, and outputs it to the comprehensive determination unit 121.

（ステップＳ９３８）
ステップＳ９３８において、総合判定部１２１は、入力された位置を一時的な記憶手段に記憶する。 (Step S938)
In step S938, the comprehensive determination unit 121 stores the input position in a temporary storage unit.
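
Steps S931 to S938 amount to fetching only the metadata that the selected weight types need. The following is a minimal sketch of that branching, assuming (as the later steps suggest) that button 511 corresponds to time + clothing, 512 to time + position, and 513 to position + clothing. The function name and the two dictionaries, which stand in for the shooting time recording unit 119 and the shooting position recording unit 120, are hypothetical.

```python
def fetch_search_input_metadata(selected_buttons, video_id, time_by_id, position_by_id):
    """Return only the metadata needed for the selected weight types.

    selected_buttons: set of selected button numbers, e.g. {511, 513}.
    time_by_id / position_by_id: hypothetical stand-ins for units 119 and 120.
    """
    metadata = {}
    if selected_buttons & {511, 512}:   # S931: shooting time is needed
        metadata["time"] = time_by_id[video_id]          # S932 to S934
    if selected_buttons & {512, 513}:   # S935: shooting position is needed
        metadata["position"] = position_by_id[video_id]  # S936 to S938
    return metadata
```

With no button selected, neither lookup is performed and the process goes straight on to the database preparation in step S939.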

（ステップＳ９３９）
ステップＳ９３９において、総合判定部１２１は、記憶部１１２の記録媒体から、特徴量データベース８０１を読み出すための、データベース読み出し準備処理を行う。
具体的には、総合判定部１２１は、データベースのデーモン又はサービスにアクセスして、特徴量データベース８０１を使用するための各種準備のコマンドを送信する。 (Step S939)
In step S939, the comprehensive determination unit 121 performs a database read preparation process for reading the feature amount database 801 from the recording medium of the storage unit 112.
Specifically, the comprehensive determination unit 121 accesses a database daemon or service, and transmits various preparation commands for using the feature amount database 801.

（ステップＳ９４０）
ステップＳ９４０において、総合判定部１２１は、特徴量データベース８０１の先頭行を参照行にセットする。
参照行は、総合判定部１２１の一時的な記憶手段に記憶された変数等であり、特徴量データベース８０１の表を１行づつ読み出すための、行番号を示すカウンタである。 (Step S940)
In step S940, the comprehensive determination unit 121 sets the first row of the feature amount database 801 as a reference row.
The reference row is a variable or the like stored in the temporary storage unit of the comprehensive determination unit 121, and serves as a counter indicating the row number so that the table of the feature amount database 801 can be read one row at a time.

（ステップＳ９４１）
次に、ステップＳ９４１において、総合判定部１２１は、参照行から映像ＩＤや顔特徴量、着衣特徴量、撮影時刻、撮影位置を読み出して取得する。 (Step S941)
Next, in step S941, the comprehensive determination unit 121 reads and acquires the video ID, the facial feature value, the clothing feature value, the shooting time, and the shooting position from the reference row.

（ステップＳ９４２）
次に、ステップＳ９４２において、総合判定部１２１は、顔類似度算出処理を行う。
具体的には、総合判定部１２１は、ステップＳ９４１で取得した顔特徴量と記憶手段に記憶されている検索入力画像の顔特徴量から、この映像ＩＤをもつ検索対象画像と検索入力画像との顔類似度を算出する。 (Step S942)
Next, in step S942, the comprehensive determination unit 121 performs face similarity calculation processing.
Specifically, the comprehensive determination unit 121 calculates the face similarity between the search target image having this video ID and the search input image from the face feature value acquired in step S941 and the face feature value of the search input image stored in the storage unit.

この顔類似度の算出は、顔特徴量数がｍ個である場合は、例えば以下の式のようにして求められる。 The calculation of the face similarity is obtained, for example, by the following equation when the number of face feature amounts is m.

（Σは、合計を表す）。
ここで、ＦＷｉは顔特徴量の重要度を示す係数であり、

ＦＷ_i＝｛ｆｗ₁, ｆｗ₂，…，ｆｗ_m｝

で表わされる。
ＦＦＩ_iは検索入力画像の顔特徴量であり、

ＦＦＩ_i＝｛ｆｆｉ₁, ｆｆｉ₂，…，ｆｆｉ_m｝

で表わされる。
ＦＦＴ_iは検索対象画像の顔特徴量であり、

ＦＦＴ_i＝｛ｆｆｔ₁, ｆｆｔ₂，…，ｆｆｔ_m｝

で表わされる。
すなわち、このステップにより、顔特徴量間の距離を計算して顔類似度を算出している。
なお、このような方法以外にも、特徴量の分布関数を用いた統計的な手法や、サポート・ベクター・マシン等の人工知能的な手法を用いて顔類似度を算出することが可能である。 (Σ represents the total).
Here, FW_i is a coefficient indicating the importance of each face feature value, and is represented by

FW_i = {fw_1, fw_2, ..., fw_m}

FFI_i is the face feature value of the search input image, and is represented by

FFI_i = {ffi_1, ffi_2, ..., ffi_m}

FFT_i is the face feature value of the search target image, and is represented by

FFT_i = {fft_1, fft_2, ..., fft_m}

That is, in this step, the face similarity is calculated from the distance between the face feature values.
Besides this method, the face similarity can also be calculated using a statistical method based on a distribution function of the feature values, or an artificial intelligence method such as a support vector machine.
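
The equation itself is not reproduced in this text, so the following is only a minimal sketch of a distance-based face similarity. It assumes that the distance is a weighted sum of absolute differences between FFI_i and FFT_i, and that the distance is mapped to a similarity by 1 / (1 + distance). Both choices are assumptions made for illustration, as is the function name, and they may differ from the equation in the original publication.

```python
def face_similarity(fw, ffi, fft):
    """Hypothetical weighted-distance similarity (not the patent's exact equation).

    fw:  importance coefficients fw_1 .. fw_m
    ffi: face feature values of the search input image
    fft: face feature values of the search target image
    """
    if not (len(fw) == len(ffi) == len(fft)):
        raise ValueError("fw, ffi and fft must all have m elements")
    # Assumed distance: weighted sum of absolute differences.
    distance = sum(w * abs(a - b) for w, a, b in zip(fw, ffi, fft))
    # Assumed mapping: identical features give 1.0, larger distances approach 0.
    return 1.0 / (1.0 + distance)
```

The clothing similarity described for step S944 could be sketched the same way, with CW_i, CFI_i, and CFT_i in place of FW_i, FFI_i, and FFT_i.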

（ステップＳ９４３）
次に、ステップＳ９４３において、総合判定部１２１は、重み種類選択ボタン５１１が選択されているか判定する。
具体的には、総合判定部１２１は、ステップＳ９１１で入力された重み種類選択ボタン５１１〜５１３の選択状況において、重み種類選択ボタン５１１が選択されているかを判定する。加えて、着衣情報が着衣未検出の状態であるかについても判定する。
Ｙｅｓの場合、すなわち、重み種類選択ボタン５１１が選択されており、着衣未検出の状態でない場合は、総合判定部１２１は、処理をステップＳ９４４に進める。
Ｎｏの場合、すなわち、それ以外の場合は、総合判定部１２１は、処理をステップＳ９４８に進める。 (Step S943)
Next, in step S943, the overall determination unit 121 determines whether the weight type selection button 511 is selected.
Specifically, the overall determination unit 121 determines whether the weight type selection button 511 is selected in the selection status of the weight type selection buttons 511 to 513 input in step S911. In addition, it is also determined whether the clothing information is in a state where clothing is not detected.
If Yes, that is, when the weight type selection button 511 is selected and the clothing information is not in the clothing-undetected state, the comprehensive determination unit 121 advances the process to step S944.
If No, that is, in all other cases, the comprehensive determination unit 121 advances the process to step S948.

（ステップＳ９４４）
このステップＳ９４４〜Ｓ９４７において、総合判定部１２１は、撮影時刻情報と着衣情報の２者の関係に基づく重みの算出を行う。
まず、ステップＳ９４４において、総合判定部１２１は、着衣類似度算出処理を行う
具体的には、総合判定部１２１は、ステップＳ９４１で取得した着衣特徴量と、一時的な記憶手段に記憶されている検索入力画像の着衣特徴量とを比較し、この映像ＩＤを有する検索対象画像と検索入力画像と着衣類似度を算出する。
着衣類似度の算出は、着衣特徴量の数がｎ個である場合、以下の式で求められる。 (Step S944)
In steps S944 to S947, the overall determination unit 121 calculates a weight based on the relationship between the shooting time information and the clothing information.
First, in step S944, the comprehensive determination unit 121 performs a clothing similarity calculation process. Specifically, the comprehensive determination unit 121 compares the clothing feature value acquired in step S941 with the clothing feature value of the search input image stored in the temporary storage unit, and calculates the clothing similarity between the search target image having this video ID and the search input image.
The calculation of clothing similarity is obtained by the following equation when the number of clothing feature values is n.

ここでＣＷｉは着衣特徴量の重要度を示す係数であり、
ＣＷ_i＝｛ｃｗ₁, ｃｗ₂，…，ｃｗ_n｝
で表わされる。
ＣＦＩ_iは検索入力画像の着衣特徴量であり、
ＣＦＩ_i＝｛ｃｆｉ₁, ｃｆｉ₂，…，ｃｆｉ_n｝
で表わされる。
ＣＦＴ_iは検索対象画像の着衣特徴量であり、
ＣＦＴ_i＝｛ｃｆｔ₁, ｃｆｔ₂，…，ｃｆｔ_n｝
で表わされる。
なお、この着衣類似度についても、特徴量間の距離を算出している。
この着衣類似度に関しても、特徴量の分布関数を用いたり、サポート・ベクター・マシン等の人工知能的な手法や、統計的な手法を用いて算出することが可能である。 Here, CWi is a coefficient indicating the importance of the clothing feature value,
CW_i = {cw_1, cw_2, ..., cw_n}
CFI_i is the clothing feature value of the search input image, and is represented by
CFI_i = {cfi_1, cfi_2, ..., cfi_n}
CFT_i is the clothing feature value of the search target image, and is represented by
CFT_i = {cft_1, cft_2, ..., cft_n}
As with the face similarity, the clothing similarity is calculated from the distance between the feature values.
The clothing similarity can likewise be calculated using a distribution function of the feature values, an artificial intelligence method such as a support vector machine, or a statistical method.

（ステップＳ９４５）
ステップＳ９４５において、総合判定部１２１は、算出した着衣類似度を所定の閾値と比較して、着衣の同一性について判断する。
この閾値は、使用する着衣特徴量により異なるため、最適なパラメータを予め指定することができる。
判定結果は、一時的な記憶手段に記憶する。 (Step S945)
In step S945, the overall determination unit 121 compares the calculated clothing similarity with a predetermined threshold value and determines the clothing identity.
Since this threshold value varies depending on the clothing feature to be used, an optimum parameter can be designated in advance.
The determination result is stored in a temporary storage means.

（ステップＳ９４６）
ステップＳ９４６において、総合判定部１２１は、時刻の近傍性を所定の閾値と比較して判断する。
具体的には、総合判定部１２１は、ステップＳ９４１で取得した撮影時刻と記憶手段に記憶されている検索入力画像の撮影時刻と検索対象画像の撮影時刻との差の絶対値を、時刻の近傍性に係る所定の閾値と比較して判断する。
この所定の閾値としては、例えば、人が歩行したりタクシー等で移動する速度を基にして設定することができる。
この判定結果は、総合判定部１２１は、一時的な記憶手段に記憶する。 (Step S946)
In step S946, the comprehensive determination unit 121 determines the time proximity by comparing it with a predetermined threshold.
Specifically, the comprehensive determination unit 121 compares the absolute value of the difference between the shooting time of the search target image acquired in step S941 and the shooting time of the search input image stored in the storage unit with a predetermined threshold for time proximity.
The predetermined threshold can be set based on the speed at which a person walks or moves by taxi, for example.
The determination result is stored in the temporary storage unit by the comprehensive determination unit 121.
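
The two threshold comparisons in steps S945 and S946 can be sketched as follows. This is a minimal illustration, assuming that a larger clothing similarity means a closer match and that both comparisons are inclusive. The function names and the example threshold values are hypothetical.

```python
from datetime import datetime, timedelta

def clothing_is_same(clothing_similarity, threshold):
    """S945 (sketch): clothing counts as the same when the similarity reaches the threshold."""
    return clothing_similarity >= threshold

def time_is_near(target_time, input_time, max_gap):
    """S946 (sketch): compare |target - input| with the time-proximity threshold."""
    return abs(target_time - input_time) <= max_gap

# Example: two images taken 20 minutes apart, with a 30-minute threshold.
near = time_is_near(datetime(2008, 2, 21, 10, 0),
                    datetime(2008, 2, 21, 10, 20),
                    timedelta(minutes=30))
```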

（ステップＳ９４７）
ステップＳ９４７において、総合判定部１２１は、撮影時刻情報と着衣情報の２者の関係に基づく重みの算出処理を行う。
この算出処理においては、ユーザが時刻＋着衣重み設定表６１０の設定値入力欄６１１〜６１４に入力した値を用いて行う。
具体的には、総合判定部１２１は、時刻の近傍性の判定結果と、図５の時刻＋着衣重み設定表６１０でユーザが入力した値とを掛ける等の処理を行って重みの算出を行う。
この処理の後、総合判定部１２１は、処理をステップＳ９５０へ進める。 (Step S947)
In step S947, the comprehensive determination unit 121 performs a weight calculation process based on the relationship between the shooting time information and the clothing information.
This calculation process is performed using values input by the user in the setting value input fields 611 to 614 of the time + clothing weight setting table 610.
Specifically, the comprehensive determination unit 121 calculates the weight by processing such as multiplying the time proximity determination result by the value input by the user in the time + clothing weight setting table 610 in FIG. 5.
After this process, the overall determination unit 121 advances the process to step S950.
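
One way to picture the weight calculation in step S947 is as a table lookup keyed by the two determination results. The sketch below assumes that the four setting value input fields 611 to 614 correspond to the four combinations of time near or not near and clothing same or different. That layout, the numeric values, and the names are all hypothetical, and the text itself only says that processing such as multiplication is used.

```python
# Hypothetical contents of table 610: one user-entered weight per
# (time_near, clothing_same) combination. The values are illustrative only.
TIME_CLOTHING_WEIGHTS = {
    (True, True): 1.5,
    (True, False): 0.5,
    (False, True): 1.0,
    (False, False): 1.0,
}

def time_clothing_weight(time_near, clothing_same,
                         table=TIME_CLOTHING_WEIGHTS, enabled=True):
    """Return the time + clothing weight, or the neutral weight 1 when disabled.

    enabled is False when button 511 is not selected or clothing was not
    detected, in which case the weight stays at 1.
    """
    if not enabled:
        return 1.0
    return table[(time_near, clothing_same)]
```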

（ステップＳ９４８）
ステップＳ９４８において、ステップＳ９１１で入力された重み種類選択ボタン５１１〜５１３の選択状況において、重み種類選択ボタン５１１が選択されていない場合、総合判定部１２１は撮影時刻情報と着衣情報の２者の関係に基づく重みの値を１とする。
また、総合判定部１２１は、着衣情報が着衣未検出の状態であった場合にも、同様に重みの値を１とする。 (Step S948)
In step S948, when the weight type selection button 511 is not selected in the selection status of the weight type selection buttons 511 to 513 input in step S911, the comprehensive determination unit 121 sets the weight value based on the relationship between the shooting time information and the clothing information to 1.
The overall determination unit 121 similarly sets the weight value to 1 even when the clothing information is in a state in which clothing is not detected.

（ステップＳ９４９）
ステップＳ９４９において、総合判定部１２１は、重み種類選択ボタン５１２が選択されていたかを判定する。
具体的には、ステップＳ９１１で入力された重み種類選択ボタン５１１〜５１３の選択状況において、ユーザにより重み種類選択ボタン５１２が選択されていたかを判定する。
Ｙｅｓの場合は、総合判定部１２１は、処理をステップＳ９５０へ進める。
Ｎｏの場合は、総合判定部１２１は、処理をステップＳ９５３へ進める。 (Step S949)
In step S949, the overall determination unit 121 determines whether the weight type selection button 512 has been selected.
Specifically, it is determined whether or not the weight type selection button 512 has been selected by the user in the selection status of the weight type selection buttons 511 to 513 input in step S911.
If Yes, the comprehensive determination unit 121 advances the process to step S950.
If No, the comprehensive determination unit 121 advances the process to step S953.

（ステップＳ９５０）
以下、ステップＳ９５０〜Ｓ９５２において、総合判定部１２１は、まず、撮影時刻情報と撮影位置情報の２者の関係に基づく重みの算出を行う。
まず、ステップＳ９５０において、総合判定部１２１は、時刻の近傍性判断処理を行う。
具体的には、総合判定部１２１は、ステップＳ９４１で取得した撮影時刻と、一時的な記憶手段に記憶されている検索入力画像の撮影時刻から、所定の閾値と比較して、時刻の近傍性を判断する。
すなわち、検索対象画像と検索入力画像の撮影時刻との差の値を、所定の閾値と比較して、時刻の近傍性を判断する。
判断した結果は、総合判定部１２１の一次的な記憶手段に記憶する。 (Step S950)
Hereinafter, in steps S950 to S952, the comprehensive determination unit 121 first calculates a weight based on the relationship between the shooting time information and the shooting position information.
First, in step S950, the comprehensive determination unit 121 performs time proximity determination processing.
Specifically, the comprehensive determination unit 121 determines the time proximity from the shooting time acquired in step S941 and the shooting time of the search input image stored in the temporary storage unit, by comparison with a predetermined threshold.
That is, the difference between the shooting times of the search target image and the search input image is compared with a predetermined threshold to determine the time proximity.
The determination result is stored in the temporary storage unit of the comprehensive determination unit 121.

（ステップＳ９５１）
ステップＳ９５１において、総合判定部１２１は、位置の近傍性判断処理を行う。
具体的には、総合判定部１２１は、ステップＳ９４１で取得した撮影位置と記憶手段に記憶されている検索入力画像の撮影位置から位置の近傍性を予め設定した閾値と比較して判断する。
すなわち、総合判定部１２１は、検索対象画像と検索入力画像の撮影位置の差を所定の閾値と比較して、位置の近傍性を判断する。
判断した結果は、同様に、総合判定部１２１の一次的な記憶手段に記憶する。 (Step S951)
In step S951, the comprehensive determination unit 121 performs position proximity determination processing.
Specifically, the comprehensive determination unit 121 determines the position proximity by comparing the shooting position acquired in step S941 and the shooting position of the search input image stored in the storage unit against a preset threshold.
That is, the comprehensive determination unit 121 determines the position proximity by comparing the difference between the shooting positions of the search target image and the search input image with a predetermined threshold.
The determination result is likewise stored in the temporary storage unit of the comprehensive determination unit 121.

なお、上述のステップＳ９５０とＳ９５１の所定の閾値としては、上述の場合の条件（１）〜（７）を考慮した閾値を用いることができる。
この閾値は、例えば、時刻の閾値としては、同一日や非同一日であるか否か、数時間以内であるか否か等に関する値を用いることができる。
位置の閾値としては、撮影位置の差が実際の距離（所定メートル等）以内であるか、同一建物（駅等）の構内である等の値を用いることができる。
さらに、これらの閾値は、実際のアプリケーションに従って調整して用いることができる。 In addition, as the predetermined threshold value in the above-described steps S950 and S951, a threshold value considering the conditions (1) to (7) in the above case can be used.
For example, the time threshold can be a value relating to whether the two images were captured on the same day or on different days, whether they were captured within several hours of each other, and so on.
The position threshold can be a value relating to whether the difference between the shooting positions is within an actual distance (a predetermined number of meters, for example), whether the positions are on the premises of the same building (a station, for example), and so on.
Furthermore, these threshold values can be adjusted and used according to the actual application.
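
The kinds of threshold mentioned above can be sketched as small predicates. This is only an illustration: it assumes that shooting times are `datetime` values and that shooting positions are planar (x, y) coordinates in meters, neither of which is stated in the text, and the function names are hypothetical.

```python
import math
from datetime import datetime, timedelta

def same_day(t1, t2):
    """Time threshold variant: were both images captured on the same calendar day?"""
    return t1.date() == t2.date()

def within_hours(t1, t2, hours):
    """Time threshold variant: were both images captured within the given number of hours?"""
    return abs(t1 - t2) <= timedelta(hours=hours)

def within_meters(p1, p2, meters):
    """Position threshold variant: is the straight-line distance within the given meters?

    p1 and p2 are assumed to be (x, y) coordinates in meters.
    """
    return math.dist(p1, p2) <= meters
```

A "same building" condition would instead compare a building identifier associated with each camera, if such an identifier is recorded.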

また、複数の離れた位置に、近い時刻で検出された場合は、より「同一人物の可能性が高い」とすることもできる。 In addition, when detected at a plurality of distant positions at a close time, it can be further stated that “the possibility of the same person is high”.

（ステップＳ９５２）
ステップＳ９５２において、総合判定部１２１は、判定された値を用いて、撮影時刻情報と撮影位置情報の２者の関係に基づく重みの算出処理を行う。
この重みの算出は、ユーザにより時刻＋位置重み設定表６２０に入力された値を用いて行う。
具体的には、総合判定部１２１は、上述のステップＳ９５０とステップＳ９５１の判定結果を、図５の時刻＋位置重み設定表６２０に適用して、撮影時刻情報と撮影位置情報に基づく重みの算出を行う。
この処理を行った後、総合判定部１２１は、処理をステップＳ９５４に進める。 (Step S952)
In step S952, the overall determination unit 121 performs weight calculation processing based on the relationship between the shooting time information and the shooting position information using the determined value.
The calculation of the weight is performed using a value input to the time + position weight setting table 620 by the user.
Specifically, the comprehensive determination unit 121 applies the determination results of steps S950 and S951 described above to the time + position weight setting table 620 in FIG. 5, and calculates the weight based on the shooting time information and the shooting position information.
After performing this process, the overall determination unit 121 advances the process to step S954.

（ステップＳ９５３）
ステップＳ９５３において、総合判定部１２１は撮影時刻情報と撮影位置情報の２者の関係に基づく重みの値を１とする。
これは、ステップＳ９１１で入力された重み種類選択ボタン５１１〜５１３の選択状況において、重み種類選択ボタン５１２が選択されていない場合である。 (Step S953)
In step S953, the overall determination unit 121 sets the weight value based on the relationship between the shooting time information and the shooting position information to 1.
This is a case where the weight type selection button 512 is not selected in the selection status of the weight type selection buttons 511 to 513 input in step S911.

（ステップＳ９５４）
ステップＳ９５４において、重み種類選択ボタン５１３が選択されているか判定する。
具体的には、総合判定部１２１は、ステップＳ９１１で入力された重み種類選択ボタン５１１〜５１３の選択状況において、重み種類選択ボタン５１３が選択されているかを判定する。
Ｙｅｓの場合、総合判定部１２１は、処理をステップＳ９５５に進める。
Ｎｏの場合、総合判定部１２１は、処理をステップＳ９５９に進める。 (Step S954)
In step S954, the comprehensive determination unit 121 determines whether the weight type selection button 513 is selected.
Specifically, the comprehensive determination unit 121 determines whether the weight type selection button 513 is selected in the selection status of the weight type selection buttons 511 to 513 input in step S911.
If Yes, the comprehensive determination unit 121 advances the process to step S955.
If No, the comprehensive determination unit 121 advances the process to step S959.

（ステップＳ９５５）
以下、ステップＳ９５５〜Ｓ９５８において、総合判定部１２１は撮影位置情報と着衣情報の２者の関係に基づく重みの算出を行う。
まずは、ステップＳ９５５において、着衣類似度算出処理を行う。
具体的には、ステップＳ９４１で取得した着衣特徴量と記憶手段に記憶されている検索入力画像の着衣特徴量を基に、この映像ＩＤをもつ検索対象画像と検索入力画像との間で、着衣類似度を算出する。
着衣類似度の算出は、ステップＳ９４４の方法等と同様に行う。 (Step S955)
Thereafter, in steps S955 to S958, the comprehensive determination unit 121 calculates a weight based on the relationship between the shooting position information and the clothing information.
First, in step S955, clothing similarity calculation processing is performed.
Specifically, the clothing similarity between the search target image having this video ID and the search input image is calculated based on the clothing feature value acquired in step S941 and the clothing feature value of the search input image stored in the storage unit.
The calculation of the clothing similarity is performed in the same manner as the method in step S944.

（ステップＳ９５６）
ステップＳ９５６において、総合判定部１２１は、着衣の同一性判断処理を行う。
具体的には、総合判定部１２１は、ステップＳ９４５と同様に、算出した着衣類似度を所定の閾値と比較して、着衣の同一性を判断する。 (Step S956)
In step S956, the overall determination unit 121 performs clothing identity determination processing.
Specifically, as in step S945, the overall determination unit 121 compares the calculated clothing similarity with a predetermined threshold value to determine the identity of the clothing.

（ステップＳ９５７）
ステップＳ９５７において、位置の同一性判断処理を行う。
具体的には、総合判定部１２１は、ステップＳ９４１で取得した撮影位置と記憶手段に記憶されている検索入力画像の撮影位置から位置の近傍性を、所定の閾値と比較して判断する。 (Step S957)
In step S957, position identity determination processing is performed.
Specifically, the comprehensive determination unit 121 determines the proximity of the position from the shooting position acquired in step S941 and the shooting position of the search input image stored in the storage unit by comparing with a predetermined threshold.

（ステップＳ９５８）
ステップＳ９５８において、総合判定部１２１は、ステップＳ９５６とステップＳ９５７の判定結果について、図５の位置＋着衣重み設定表６３０の値を適用して、撮影位置情報と着衣情報の２者の関係に基づく重みの算出を行う。 (Step S958)
In step S958, the comprehensive determination unit 121 applies the values of the position + clothing weight setting table 630 in FIG. 5 to the determination results of steps S956 and S957, and calculates the weight based on the relationship between the shooting position information and the clothing information.

（ステップＳ９５９）
ステップＳ９５９において、総合判定部１２１は撮影位置情報と着衣情報の２者の関係に基づく重みの値を１とする。すなわち、ステップＳ９１１で入力された重み種類選択ボタン５１１〜５１３の選択状況において、重み種類選択ボタン５１３が選択されていない場合の重みの値は１となる。
なお、着衣情報が着衣未検出の状態であった場合にも、同様に重みの値を１とする。 (Step S959)
In step S959, the overall determination unit 121 sets the weight value based on the relationship between the shooting position information and the clothing information to 1 (one). That is, in the selection status of the weight type selection buttons 511 to 513 input in step S911, the weight value is 1 when the weight type selection button 513 is not selected.
It should be noted that the weight value is set to 1 in the same manner even when the clothing information is in a state where clothing is not detected.

（ステップＳ９６０）
ステップＳ９６０において、総合判定部１２１は、総合類似度を算出する。
具体的には、ステップＳ９４２にて算出した顔類似度に対し、それぞれ、ステップＳ９４３〜Ｓ９４８、Ｓ９４９〜Ｓ９５３、Ｓ９５４〜Ｓ９５９で算出した各重み値を掛け合わせて、総合類似度の値を算出する。
すなわち、総合類似度の値は：

総合類似度＝顔類似度×（時刻＋着衣重み設定値）×（時刻＋位置重み設定値）×（位置＋着衣重み設定値）

のような式により、算出する。
ここで、各重みの設定値群に対しては、上述の閾値による条件を基にした値を適用して計算する。この条件としては、各重み設定の表における位置や時刻の近傍性が、閾値以下か閾値を超えたかについて、上述の（１）〜（７）の条件を用いて判断する。
以上のように、着衣特徴量が直接、総合類似度に反映されるわけではない。 (Step S960)
In step S960, the overall determination unit 121 calculates the overall similarity.
Specifically, the face similarity calculated in step S942 is multiplied by each of the weight values calculated in steps S943 to S948, S949 to S953, and S954 to S959 to obtain the overall similarity value.
That is, the overall similarity value is calculated by the following formula:

Overall similarity = face similarity × (time + clothing weight setting value) × (time + position weight setting value) × (position + clothing weight setting value)

Here, the value applied for each weight is selected from the corresponding group of setting values according to the threshold conditions described above. That is, whether the proximity of position or time in each weight setting table is at or below the threshold or exceeds it is determined using conditions (1) to (7) described above.
As described above, the clothing feature value is not reflected directly in the overall similarity.
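
The product in step S960 can be written directly. In the sketch below, each weight defaults to 1, which matches the neutral value assigned in steps S948, S953, and S959 when a weight type is not selected. The function and parameter names are hypothetical.

```python
def overall_similarity(face_similarity,
                       w_time_clothing=1.0,
                       w_time_position=1.0,
                       w_position_clothing=1.0):
    """Overall similarity = face similarity x the three pairwise weights."""
    return (face_similarity
            * w_time_clothing
            * w_time_position
            * w_position_clothing)
```

Because the clothing feature value enters only through the weights, an image with a low face similarity is not promoted by matching clothing alone beyond the factor the user has set.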

（ステップＳ９６１）
ステップＳ９６１において、総合判定部１２１は、検索出力画像判断処理を行う。
具体的には、総合判定部１２１は、算出した総合類似度を所定の閾値と比較して、検索出力画像を出力するか判断する。 (Step S961)
In step S961, the overall determination unit 121 performs a search output image determination process.
Specifically, the overall determination unit 121 compares the calculated overall similarity with a predetermined threshold value to determine whether to output a search output image.

（ステップＳ９６２）
ステップＳ９６２において、総合判定部１２１は、ステップＳ９６１で判断をした値が、閾値を上回る値であったかについて判定する。
Ｙｅｓの場合は、総合判定部１２１は、処理をステップＳ９６３に進める。
Ｎｏの場合は、総合判定部１２１は、処理をステップＳ９６４に進める。 (Step S962)
In step S962, the comprehensive determination unit 121 determines whether the value determined in step S961 is a value that exceeds the threshold value.
If Yes, the comprehensive determination unit 121 advances the process to step S963.
If No, the comprehensive determination unit 121 advances the process to step S964.

（ステップＳ９６３）
ステップＳ９６３において、総合判定部１２１は、検索対象画像を検索出力画像の１つと判定し、該検索対象画像を映像ＩＤとともに一時的な記憶手段に記憶する。 (Step S963)
In step S963, the overall determination unit 121 determines that the search target image is one of the search output images, and stores the search target image in a temporary storage unit together with the video ID.

（ステップＳ９６４）
ステップＳ９６４において、総合判定部１２１は、参照行を１行下にずらす。
これにより、特徴量データベース８０１の末尾行（最終行）を超えるまで、次の行を参照することができる。末尾行は、特徴量データベース８０１の表の最終行である。 (Step S964)
In step S964, the comprehensive determination unit 121 shifts the reference row down by one row.
Thereby, the next line can be referred to until the end line (final line) of the feature amount database 801 is exceeded. The last row is the last row of the table of the feature amount database 801.

（ステップＳ９６５）
総合判定部１２１は、参照行が末尾行を超えたかについて判定する。
Ｙｅｓの場合、総合判定部１２１は、処理をステップＳ９６６に進める。
Ｎｏの場合、総合判定部１２１は、処理をステップＳ９４１に戻して、一連の処理を参照行が末尾行を超えるまで同様に繰り返す。 (Step S965)
In step S965, the comprehensive determination unit 121 determines whether the reference row has passed the last row.
If Yes, the comprehensive determination unit 121 advances the process to step S966.
If No, the comprehensive determination unit 121 returns the process to step S941 and repeats the series of processes in the same manner until the reference row passes the last row.
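
The row-by-row loop from step S940 to step S965 can be sketched as follows. This is a minimal illustration, assuming that the rows of the feature amount database 801 are available as an iterable of dictionaries. `score_row` is a hypothetical callback standing in for steps S941 to S960, and the `video_id` key is likewise an assumption.

```python
def search_rows(rows, score_row, threshold):
    """Scan the rows in order and keep those whose overall similarity exceeds the threshold."""
    hits = []
    for row in rows:                  # S940, S964, S965: advance one row at a time
        score = score_row(row)        # S941 to S960: compute the overall similarity
        if score > threshold:         # S961, S962: compare with the threshold
            hits.append((row["video_id"], score))  # S963: remember the hit
    return hits
```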

（ステップＳ９６６）
ステップＳ９６６において、総合判定部１２１は、検索出力画像一覧リスト化処理を行う。
具体的には、総合判定部１２１は、ステップＳ９６５で参照行が末尾行を超えたと判定した場合、一時的な記憶手段に記憶された全ての総合類似度を映像ＩＤとともに取り出す。この上で、総合判定部１２１は、映像ＩＤの画像データを検索出力画像一覧としてリスト化して、一時的な記憶手段に記憶する。
このリスト化の際に、総合判定部１２１は、総合類似度、時刻順、カメラ順等の順番になるようにソート処理を実行するようにしてもよい。 (Step S966)
In step S966, the comprehensive determination unit 121 performs a search output image list listing process.
Specifically, when it is determined in step S965 that the reference row exceeds the end row, the overall determination unit 121 extracts all the overall similarities stored in the temporary storage unit together with the video ID. Then, the overall determination unit 121 lists the image data of the video ID as a search output image list and stores it in a temporary storage unit.
When creating the list, the comprehensive determination unit 121 may sort it by overall similarity, time, camera, or the like.
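
The optional sort in step S966 can be sketched as below, assuming that each hit is a dictionary with `similarity`, `time`, and `camera` entries. These field names are hypothetical. Sorting by similarity is done in descending order so that the best matches come first, which is an assumption about the intended presentation.

```python
def sort_hits(hits, key="similarity"):
    """Return the search output list sorted by the chosen criterion."""
    if key == "similarity":
        # Highest overall similarity first.
        return sorted(hits, key=lambda h: h["similarity"], reverse=True)
    if key in ("time", "camera"):
        return sorted(hits, key=lambda h: h[key])
    raise ValueError(f"unsupported sort key: {key}")
```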

（ステップＳ９６７）
ステップＳ９６７において、総合判定部１２１は、検索出力画像一覧をネットワーク部１１１へ出力する。 (Step S967)
In step S967, the overall determination unit 121 outputs the search output image list to the network unit 111.

（ステップＳ９６８）
ステップＳ９６８において、ネットワーク部１１１は、入力された検索出力画像一覧をネットワークを介して、監視端末１０３に送信する。この処理のタイミングは、タイミングＴ１０３である。 (Step S968)
In step S968, the network unit 111 transmits the input search output image list to the monitoring terminal 103 via the network. The timing of this process is timing T103.

（ステップＳ９６９）
ステップＳ９６９において、録画装置１０１は、次の検索実行要求の受信待ち状態となる。
ネットワーク部１１１は、次の検索実行要求を受信した場合は、処理をステップＳ９１１に戻す。ここまでが、録画装置１０１での処理である。
録画装置１０１は、この状態にてシャットダウン等を行うと、処理を終了する。
以上により、人物検索処理を終了する。 (Step S969)
In step S969, the recording apparatus 101 waits for reception of the next search execution request.
If the network unit 111 receives the next search execution request, the network unit 111 returns the process to step S911. Up to here is the processing in the recording apparatus 101.
The recording apparatus 101 ends the process when performing a shutdown or the like in this state.
Thus, the person search process is completed.

このように構成することで、以下のような効果が得られる。
まず、従来技術１においては、正面方向以外の方向から映した顔画像に対する認識精度は低いという問題があった。また、顔の大きさが小さい場合には、認識精度も低いという問題があった。
これにより、人物検索方法を実用的に用いることが難しいという問題があった。このため、人物検索機能は、本格的には利用されていないという問題があった。 By configuring in this way, the following effects can be obtained.
First, prior art 1 has a problem that recognition accuracy is low for face images captured from directions other than the front. There is also a problem that recognition accuracy is low when the face appears small in the image.
As a result, it has been difficult to put the person search method to practical use, and the person search function has not come into full-scale use.

しかしながら、本発明の実施の形態に係る人物検索方法においては、顔認識の精度がそれほど高くなくても、精度の高い人物検索方法を提供することができる。
これは、顔特徴量に加えて、服装の特徴である着衣特徴量を用い、さらに時刻・位置・着衣のルールを用い、さらに重み付けの値を算出して適用することにより実現することができる。
これにより、ユーザが所望の人物の映像を、より効率よく簡単に見つけることが可能となる。
よって、監視システムにおいて、人物検索方法を実用的に用いることが可能となる。 However, the person search method according to the embodiment of the present invention can provide a highly accurate person search method even if the accuracy of face recognition is not so high.
This can be realized by using a clothing feature value which is a feature of the clothing in addition to the face feature value, further using a time / position / clothing rule, and further calculating and applying a weighting value.
Thereby, it becomes possible for the user to find a video of a desired person more efficiently and easily.
Therefore, the person search method can be practically used in the monitoring system.

このように人物検索方法の精度が向上することにより、その時点で録画されている全画像に映された全ての人物について、その人物の含まれている画像を検索することも可能である。
その検索した画像の映像ＩＤを用いることで、位置と時間の情報を得ることができる。
これにより、特定の場所や時間での人の流れについてリサーチを行うことが可能になる。
たとえば、あるビル内での人の動きを検索して、より効率的な店舗の配置等に用いることができる。
また、例えば、遊園地等の遊戯施設内で、入場者がアトラクションを廻る順番等が把握できるため、各アトラクションの値段を適切に設定したり、遊戯施設内の通路を最適化することも可能である。
なお、多数の人物について検索を行う場合は、プライバシーに配慮して、各検索結果については記録を残さずに、各映像ＩＤの移動の情報といった値のみを統計処理することが望ましい。 Thus, by improving the accuracy of the person search method, it is also possible to search for an image including the person for all persons shown in all images recorded at that time.
By using the video ID of the searched image, position and time information can be obtained.
This makes it possible to conduct research on the flow of people at specific locations and times.
For example, the movement of a person in a certain building can be searched and used for more efficient store arrangements.
Also, for example, in an amusement facility such as an amusement park, the order in which visitors go around the attractions can be grasped, so it is possible to set the price of each attraction appropriately and to optimize the passages within the facility.
When searching for a large number of persons, it is desirable to statistically process only the values such as movement information of each video ID without leaving a record for each search result in consideration of privacy.
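
The privacy-conscious statistics described above can be sketched as a count of movements between locations in which only the aggregate counts are kept. The input format, a list of (time, position) sightings per searched person, and the function name are assumptions made for illustration.

```python
from collections import Counter

def count_transitions(tracks):
    """Count moves between consecutive positions, keeping only aggregate totals.

    tracks: iterable of per-person lists of (time, position) sightings.
    The per-person lists are not stored; only the transition counts are returned.
    """
    counts = Counter()
    for sightings in tracks:
        # Order each person's sightings by time, then keep only the positions.
        ordered = [pos for _, pos in sorted(sightings, key=lambda s: s[0])]
        counts.update(zip(ordered, ordered[1:]))
    return counts
```

For example, two visitors who both walk from attraction A to attraction B contribute a count of 2 to the pair (A, B) without either individual route being retained.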

また、従来は、連続的な相関モデル等により人物検索を行う方法が用いられることもあった。ところが、人物検索が用いられる状況は様々であり、連続的な相関モデルによれば、これらの様々な状況に対応できないという問題があった。
これに対して、本発明においては、人物検索における着衣特徴量との関係において、位置や時刻が同一／非同一で単純に条件により場合分けして、それぞれの重みを直接に指定できるようにした。
これにより、様々な状況に対応でき、検索者の意図した検索が可能であるため、使い勝手がよくなるという効果が得られる。 Conventionally, a method of performing a person search using a continuous correlation model or the like is sometimes used. However, there are various situations in which person search is used, and there is a problem that the continuous correlation model cannot cope with these various situations.
In contrast, in the present invention, cases are simply divided by condition according to whether the position and time are the same or different, in relation to the clothing feature value in the person search, and the weight for each case can be specified directly.
As a result, it is possible to deal with various situations, and the search intended by the searcher is possible, so that an effect of improving usability can be obtained.
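
The case-by-case weighting can be sketched in Python as a lookup in a four-entry table. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the threshold values, the weight values, the names `select_weight` and `total_similarity`, and the convention that the weight multiplies the face-feature difference (so that a smaller result means "more likely the same person") are all chosen here for illustration.

```python
# Hypothetical thresholds; the claims only say that proximity and identity
# are decided by comparison with predetermined threshold values.
TIME_THRESHOLD = 600.0     # seconds: "near" if the time gap is at most this
CLOTHING_THRESHOLD = 0.3   # "same" if the clothing distance is at most this

# Time + clothing weight setting value group (placeholder numbers),
# keyed by (time_is_near, clothing_is_same).
TIME_CLOTHING_WEIGHTS = {
    (True, True): 0.5,    # near time, same clothes
    (True, False): 1.5,   # near time, different clothes
    (False, True): 0.9,   # non-near time, same clothes
    (False, False): 1.0,  # non-near time, different clothes
}

def select_weight(time_gap, clothing_distance, table=TIME_CLOTHING_WEIGHTS):
    """Pick one of the four values according to the two conditions."""
    key = (time_gap <= TIME_THRESHOLD, clothing_distance <= CLOTHING_THRESHOLD)
    return table[key]

def total_similarity(face_difference, time_gap, clothing_distance):
    """Weight the face-feature difference by the selected value (assumed product)."""
    return face_difference * select_weight(time_gap, clothing_distance)
```

Because the four values are entered directly rather than derived from a continuous model, a searcher can change the behaviour for one case without affecting the other three.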

In the embodiment of the present invention, the search target has been described as all images recorded up to that point, but the search can of course be limited to images from a particular time range, camera, or the like.
Furthermore, it is also possible to search for subjects of a specific gender or age group (adult or child). Gender and age can be judged from the face region, the clothing feature amount, the person's height, and the like.
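
Limiting the search targets by time or camera can be pictured as a filter applied before the similarity calculation. The sketch below assumes each row of the feature amount database is available as a dictionary; the key names `video_id`, `time`, and `position` and the function `restrict_targets` are hypothetical.

```python
def restrict_targets(records, start=None, end=None, cameras=None):
    """Keep only records inside the time window and camera set.

    A limit of None means that no restriction is applied for that field.
    """
    selected = []
    for rec in records:
        if start is not None and rec["time"] < start:
            continue
        if end is not None and rec["time"] > end:
            continue
        if cameras is not None and rec["position"] not in cameras:
            continue
        selected.append(rec)
    return selected

# Hypothetical rows: video ID, shooting time, shooting position.
records = [
    {"video_id": 1, "time": 100, "position": "cam1"},
    {"video_id": 2, "time": 200, "position": "cam2"},
    {"video_id": 3, "time": 300, "position": "cam1"},
]
subset = restrict_targets(records, start=150, cameras={"cam1"})
```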

In the embodiment of the present invention, the search output image list is completed in the recording device 101 before being transmitted to the monitoring terminal 103. However, each image may instead be transmitted to the monitoring terminal 103 as soon as it is determined to be a search output image in step S962.
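
The difference between sending a completed list and sending hits one by one can be sketched with a Python generator. The names `iter_search_hits` and `send_sequentially`, the scalar "features", and the `send` callback standing in for transmission to the monitoring terminal are all assumptions for this example.

```python
def iter_search_hits(records, query_face, threshold, distance):
    """Yield each matching video ID as soon as it is judged to be a hit."""
    for video_id, face in records:
        if distance(face, query_face) <= threshold:
            yield video_id

def send_sequentially(hits, send):
    """Forward hits one at a time instead of building the full list first."""
    count = 0
    for video_id in hits:
        send(video_id)  # stand-in for transmission to the monitoring terminal
        count += 1
    return count

# Hypothetical data: (video ID, scalar face feature).
records = [(1, 0.10), (2, 0.90), (3, 0.15)]
sent = []
n = send_sequentially(
    iter_search_hits(records, 0.12, 0.05, lambda a, b: abs(a - b)),
    sent.append,
)
```

With this arrangement the terminal could start displaying results before the whole recording has been scanned.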

Further, the embodiment of the present invention has been described for the case where each search input image contains one person. When a search input image contains a plurality of persons, the logical OR of the search results for the respective persons may be returned to the user.
Alternatively, in this case, the control unit of the monitoring terminal 103 may notify the user that a plurality of persons are present, have the user designate the one person to be searched for through the user interface, for example with a pointing device of the monitoring terminal 103, and perform the search for the person the user selected.
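
Returning the OR of the per-person results, as mentioned above for search input images containing several persons, amounts to a set union over the individual result lists. In this sketch the function `search_any_person`, the `search_one` callback, and the sample index are hypothetical.

```python
def search_any_person(person_features, search_one):
    """Return video IDs matched by at least one detected person (logical OR)."""
    merged = set()
    for features in person_features:
        merged |= set(search_one(features))
    return sorted(merged)

# Hypothetical per-person search results, keyed by a stand-in feature label.
index = {"person_a": [3, 1, 7], "person_b": [7, 9]}
combined = search_any_person(["person_a", "person_b"], lambda p: index[p])
```

A video ID that matches more than one person, such as 7 here, appears only once in the merged result.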

For simplicity of description, the figure shows one imaging device 102 and one monitoring terminal 103. However, a plurality of each of these devices can be connected to the network 100.
Further, this example shows a configuration in which only the designation of the search image is performed at the terminal, while recording and searching are executed by the same device. However, recording and searching may also be executed by separate devices.

Further, in the embodiment of the present invention, it has been described that, in accordance with the user's instruction, only the search accompanying information including the video ID is transmitted from the monitoring terminal 103 to the recording device 101, and the image data itself is not transmitted. However, the image data itself may be transmitted from the monitoring terminal 103, and the recording device 101 may perform the various search processes based on that image data.
Furthermore, in the embodiment of the present invention, the search target images have been described as being selected from the recorded images. However, an image captured by equipment other than the imaging device 102, for example, can also be used as a search target image.

Further, as the video ID described above, the elapsed time from the start of recording, or a unique numeric ID assigned when a specific signal (such as a face feature amount) is detected, may be used. Tag data for searching, hash data, and various other information that facilitates searching can also be used.
Further, although the weight setting values to be input have been shown as ratio values, an adjustment value in the form of an added (or subtracted) value, a kind of "weight of the weight", may be used instead. The configuration may also apply such a "weight of the weight" adjustment value in addition to the weight setting value.
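
The text does not specify how a "weight of the weight" adjustment would be combined with a ratio-style setting value. One possible reading, shown purely as an assumption, is an offset added to (or, when negative, subtracted from) the ratio before it is applied to the face-feature difference. The names `adjusted_weight` and `weighted_difference` are hypothetical.

```python
def adjusted_weight(ratio=1.0, offset=0.0):
    """Combine a ratio-style weight with an additive adjustment (assumed form).

    A negative offset acts as a subtracted value.
    """
    return ratio + offset

def weighted_difference(face_difference, ratio=1.0, offset=0.0):
    """Apply the adjusted weight to the face-feature difference."""
    return face_difference * adjusted_weight(ratio, offset)

# Ratio only, offset only, and both together.
ratio_only = weighted_difference(0.5, ratio=0.8)
offset_only = weighted_difference(0.5, offset=-0.2)
both = weighted_difference(0.5, ratio=0.8, offset=0.1)
```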

Note that the configuration and operation of the above embodiment are examples, and it goes without saying that they can be modified as appropriate without departing from the gist of the present invention.

A system configuration diagram of the monitoring system Y according to the embodiment of the present invention.
A conceptual diagram showing the structure of the feature amount database 801, which holds the face feature amount, clothing feature amount, shooting time, and shooting position according to the embodiment of the present invention.
A flowchart showing the flow of processing during video recording in the recording device 101 according to the embodiment of the present invention.
A diagram showing an example of the user interface for person search on the monitoring terminal 103 according to the embodiment of the present invention.
A diagram showing an example of the weight setting screen on the monitoring terminal 103 according to the embodiment of the present invention.
A timing chart of the execution of the person search method according to the embodiment of the present invention.
A flowchart showing the flow of processing in the monitoring terminal 103 when the person search method according to the embodiment of the present invention is executed.
A flowchart showing the flow of processing in the recording device 101 when the person search method according to the embodiment of the present invention is executed.
A flowchart showing the flow of processing in the recording device 101 when the person search method according to the embodiment of the present invention is executed.
A flowchart showing the flow of processing in the recording device 101 when the person search method according to the embodiment of the present invention is executed.
A flowchart showing the flow of processing in the recording device 101 when the person search method according to the embodiment of the present invention is executed.
A system configuration diagram of the conventional monitoring system X having a person search function.
A diagram showing an example of the user interface for person search on the conventional monitoring terminal 203.

Explanation of symbols

100, 200 Network
101, 201 Recording device
102, 202 Imaging device
103, 203 Monitoring terminal
111, 211 Network unit
112, 212 Storage unit
113, 213 Face area detection unit
114, 214 Face feature amount extraction unit
115, 215 Face feature amount recording unit
116 Clothing area detection unit
117 Clothing feature amount extraction unit
118 Clothing feature amount recording unit
119 Shooting time recording unit
120 Shooting position recording unit
121 Overall determination unit
216 Face determination unit
301 Playback video display unit
302 Playback operation unit
303 Camera switching operation unit
304 Search input image designation unit
305, 505 Search operation unit
306 Search output image list display unit
511-513 Weight type selection buttons
514 Weight setting button
610 Time + clothing weight setting table
611-614, 621-624, 631-634 Setting value input fields
620 Time + position weight setting table
630 Position + clothing weight setting table
641 Apply button
642 Close button
801 Feature amount database
811 Video ID column
812 Face feature amount column
813 Clothing feature amount column
814 Shooting time column
815 Shooting position column
X, Y Video monitoring system

Claims

A person search method in a monitoring system that records images, wherein
shooting time information, shooting position information, a face feature amount, and clothing information, each obtained from the image data of a plurality of recorded images, are stored in a randomly accessible storage medium,
an input of at least one of the following is received in advance:
a time + clothing weight setting value group consisting of four numerical values corresponding to the four combinations of the proximity of the shooting time information (near or non-near) and the identity of the clothing information (same or non-identical),
a time + position weight setting value group consisting of four numerical values corresponding to the four combinations of the proximity of the shooting time information (near or non-near) and the proximity of the shooting position information (near or non-near), and
a position + clothing weight setting value group consisting of four numerical values corresponding to the four combinations of the proximity of the shooting position information (near or non-near) and the identity of the clothing information (same or non-identical),
and when a search instruction is received together with a search input image,
for each of the plurality of recorded images, one of the four numerical values of the input weight setting value group is selected according to the relationship between the recorded image and the search input image, and the identity of the recorded person is determined using a total similarity obtained by weighting the difference between the face feature amounts of the recorded image and the search input image by the selected numerical value.

The person search method according to claim 1, wherein
the time + clothing weight setting value group, the time + position weight setting value group, and the position + clothing weight setting value group are input or changed by the user through a weight setting screen having a time + clothing weight setting table, a time + position weight setting table, and a position + clothing weight setting table, in each of which the four numerical input fields are arranged in a matrix,
the search instruction includes a designation of any plurality of the time + clothing weight setting value group, the time + position weight setting value group, and the position + clothing weight setting value group,
the proximity of the shooting time information, the identity of the clothing information, and the proximity of the shooting position information are each determined by comparison with a corresponding predetermined threshold value, and
the total similarity is obtained by weighting the difference of the face feature amounts through multiplication by the numerical value selected from each of the designated plurality of weight setting value groups.

The person search method according to claim 1, wherein the time + clothing weight setting value group, the time + position weight setting value group, and the position + clothing weight setting value group are set on the basis of the following seven conditions:
(1) Persons with the same clothes at a nearby time are highly likely to be the same person.
(2) Persons with the same clothes may be the same person even at a non-nearby time.
(3) Persons with non-identical clothes at a nearby time are highly likely not to be the same person.
(4) Persons at a nearby time and a nearby position are highly likely to be the same person.
(5) Persons at a nearby time and a non-nearby position are highly likely not to be the same person.
(6) Persons at a non-nearby time and a nearby position may be the same person.
(7) Persons at a nearby position with the same clothes may be the same person.

A monitoring system that records images, comprising an imaging device, a recording device, and a monitoring terminal, wherein
the recording device comprises:
face feature amount extraction means for extracting a face feature amount from an image captured by the imaging device;
clothing feature amount extraction means for extracting a clothing feature amount from an image captured by the imaging device;
storage means for storing, in a randomly accessible storage medium, at least one of: image data of a plurality of recorded images, shooting time information, shooting position information, the face feature amount extracted by the face feature amount extraction means, and the clothing information extracted by the clothing feature amount extraction means; and
overall determination means for, with respect to the plurality of recorded images, selecting a numerical value from at least one of
a time + clothing weight setting value group consisting of four numerical values corresponding to the four combinations of the proximity of the shooting time information (near or non-near) and the identity of the clothing information (same or non-identical),
a time + position weight setting value group consisting of four numerical values corresponding to the four combinations of the proximity of the shooting time information (near or non-near) and the proximity of the shooting position information (near or non-near), and
a position + clothing weight setting value group consisting of four numerical values corresponding to the four combinations of the proximity of the shooting position information (near or non-near) and the identity of the clothing information (same or non-identical),
according to the relationship between the recorded image and a search input image, and determining the identity of the recorded person using a total similarity obtained by weighting the difference between the face feature amounts of the recorded image and the search input image by the selected numerical value.