JP2016197345A - Image analyzer, image analysis method and program

Info

Publication number: JP2016197345A
Application number: JP2015077145A
Authority: JP
Inventors: 祐一常松; Yuichi Tsunematsu
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2015-04-03
Filing date: 2015-04-03
Publication date: 2016-11-24

Abstract

PROBLEM TO BE SOLVED: To improve the accuracy of tracking processing. SOLUTION: An image analyzer acquires a plurality of images constituting a moving image, associates a subject detected from an acquired image with a subject in another image using at least one of a feature amount and a position of the detected subject, updates the feature amount of the associated subject, and determines whether the subject detected from the acquired image is crossing another subject. The image analyzer does not update the feature amount of a subject that is determined to be crossing, among the subjects detected from the acquired image. SELECTED DRAWING: Figure 2

Description

The present invention relates to an image analysis method using feature information of a subject.

Face recognition is one known image analysis technique for analyzing images captured by a camera connected to a network. Person search devices built on face recognition technology are used to narrow down video in which a specific person or a lost child appears.

In a person search device, feature information (feature amounts) obtained from the face and clothing of each person to be searched is acquired and analyzed in advance and recorded in a format that is easy to search. To satisfy specifications such as response speed and search accuracy, a person search device sets an upper limit on the number of persons that can be searched. If the feature amounts extracted for every frame acquired from the camera, or for every detection result, are registered in the person search device as they are, a large amount of similar information is recorded and the capacity of the person search device is strained. As a result, the number of searchable persons decreases and search time increases, which degrades system performance.

To address this problem, Patent Document 1 describes a system that provides a first server, which selects the candidate image in which a person appears best, and a second server, which generates an image search key. Patent Document 1 also describes that the image selected by the first server is transmitted from the first server to the second server and processed by the second server. Patent Document 2 discloses a method in which a plurality of detected moving objects are classified by size into types such as persons and vehicles, and stored in a storage unit by type. Both methods determine by some means that detections belong to the same person and send the extracted feature amounts to the person search device together, thereby making registration and search processing more efficient.

Patent Document 1: Japanese Patent No. 3612220
Patent Document 2: JP 2013-239205 A

However, when tracking processing is performed using the methods of Patent Document 1 or Patent Document 2, the accuracy of the tracking processing may decrease. Tracking processing is processing that identifies a person detected in a plurality of frames. Patent Document 1 and Patent Document 2 identify a person across frames using very simple information: the difference between two consecutive frames and the size of the detected moving object, respectively. Tracking processing, however, performs poorly where tracked persons cross paths and in crowded environments. In such environments, tracking may be interrupted, or the tracking target may be swapped with another person. Consequently, tracking with the methods of Patent Documents 1 and 2 may lose accuracy. Furthermore, if feature amounts of a tracking target extracted in such an environment are recorded in the person search device, the accuracy of the person search system as a whole may decrease.

The present invention has been made in view of the above problems, and its object is to improve the accuracy of tracking processing.

As one means of achieving the above object, the image analysis apparatus of the present invention has the following configuration. That is, the apparatus comprises: acquisition means for acquiring a plurality of images constituting a moving image; first integration means for associating a subject detected from an image acquired by the acquisition means with a subject in another image, using at least one of the feature amount and the position of the detected subject; update means for updating the feature amount of the subject associated by the first integration means; and crossing determination means for determining whether a subject detected from the image acquired by the acquisition means is crossing another subject, wherein the update means does not update the feature amount of a subject that the crossing determination means has determined to be crossing, among the subjects detected from the image acquired by the acquisition means.

According to the present invention, the accuracy of tracking processing can be improved.

FIG. 1 is a diagram showing the network connection configuration in the first embodiment.
FIG. 2 is a diagram showing an example of the functional blocks of the person search system in the first embodiment.
FIG. 3 is a diagram showing an example of the hardware configuration of the person search system in the first embodiment.
FIG. 4 is a processing flowchart of the image analysis device 300.
FIG. 5 is a diagram explaining the crossing determination performed by the crossing determination unit 303.
FIG. 6 is a diagram explaining the feature amounts extracted by the image analysis device 300.

Hereinafter, the present invention will be described in detail based on preferred embodiments with reference to the accompanying drawings. The configurations shown in the following embodiments are merely examples, and the present invention is not limited to the illustrated configurations.

[First Embodiment]
FIG. 1 is a diagram illustrating the network connection configuration of a person search system (image analysis system) according to the first embodiment. The components of the person search system in the first embodiment are a network camera (hereinafter, camera) 100, a network storage device (hereinafter, storage device) 200, an image analysis device 300, a person search device 400, and an image display device 500, which are connected by a LAN 600 serving as the network line. The camera 100 is an imaging device and has a function of distributing encoded image data via the LAN 600. The storage device 200 receives the image data distributed from the camera 100 via the LAN 600 and records it.

The image analysis device 300 reads image (moving image) data from the camera 100, or data recorded in the storage device 200, via the LAN 600, and generates the registration information for the person search system described later. The person search device 400 acquires the feature amounts of persons to be searched and records them. The person search device 400 also executes person search processing based on search instructions from the user. The image display device 500 plays back and displays image data distributed from the camera 100 and image data recorded in the storage device 200. The image display device 500 also displays a user interface for searching and, based on user instructions, transmits the search target and person attribute information to the person search device 400.

FIG. 2 is a diagram illustrating an example of the functional blocks of the person search system in the present embodiment. The camera 100 includes an image acquisition unit 101, an encoding unit 102, and a communication unit 103. Image data acquired and processed by the image acquisition unit 101 is converted by the encoding unit 102 into a format that can be communicated over the LAN 600. The converted image data is distributed through the communication unit 103 to the storage device 200, the image analysis device 300, and the image display device 500. The storage device 200 receives the distributed images from the LAN 600 via the communication unit 201 and records them in the recording unit 202.

The image analysis device 300 reads images from the camera 100, and images recorded in the storage device 200, from the LAN 600 via the communication unit 301. The image analysis device 300 identifies persons across frames using the person detection unit 302, the first integration unit 303, the first determination unit 304, and the second integration unit 305. Next, the image analysis device 300 extracts and updates feature amounts using the second determination unit 306 and the feature amount extraction unit 307. The image analysis device 300 then sends the person feature amounts to be registered to the person search device 400 through the person registration unit 308, which consists of the registration determination unit 3081 and the registered feature amount transmission unit 3082.

The person search device 400 receives registered person information and person search requests from the LAN 600 via the communication unit 401 and processes them in the recording unit 402 and the search unit 403, respectively. Finally, the image display device 500 receives distributed images and person search results from the LAN 600 via the communication unit 501 and displays them on the screen of the display unit 502.

FIG. 3 is a diagram illustrating an example of the hardware configuration of each component of the person search system according to the present embodiment. The CPU 31 controls the operation of each component. The ROM 32 stores control instructions, that is, programs. The RAM 33 is used as work memory when a program is executed and for temporary storage of data. The communication unit 34 performs control for communicating with external devices. The display unit 35 performs various kinds of display. The user I/F (interface) 36 accepts user operations. The processing by the components 100 to 500 of the person search system may be performed by the CPU 31 executing a program stored in the ROM 32, or may be performed by dedicated hardware.

Next, the processing of the person search system in the present embodiment will be described in detail. First, the processing performed by the camera 100 is described. The image acquisition unit 101 performs predetermined pixel interpolation and color conversion on the digital electric signal acquired from an image sensor such as a CMOS sensor, and develops a digital image in a format such as RGB or YUV. The image acquisition unit 101 then applies image correction such as white balance, sharpness, contrast, and color conversion to the developed digital image to generate image data. The image data generated by the image acquisition unit 101 is passed to the encoding unit 102. To distribute images over the network, the encoding unit 102 encodes the image data in a compression format such as JPEG, Motion JPEG, or H.264. The image data encoded by the encoding unit 102 is passed through the communication unit 103 and the LAN 600 to the storage device 200, the image analysis device 300, and the image display device 500.

In the storage device 200, the communication unit 201 receives the image data, and the recording unit 202 records the received image data in storage. The recording unit 202 may record the data in the image file format used for distribution as it is, or may split it across multiple files or multiple disks to improve read/write access performance. In response to requests from the image analysis device 300 and the image display device 500, the storage device 200 also distributes recorded image data via the communication unit 201 for image analysis and display of past video.

The image analysis device 300 performs integration processing on the data received from the camera 100 and the storage device 200, and extracts the person feature amounts to be transmitted to the person search device 400. FIG. 4 shows the flow of the image analysis processing. The present embodiment describes an example in which the camera 100, the storage device 200, the image analysis device 300, the person search device 400, and the image display device 500 are separate devices, but the configuration is not limited to this example. For example, the camera 100 and the image analysis device 300 may be a single device, the camera 100 and the storage device 200 may be a single device, or the image analysis device 300 and the person search device 400 may be a single device.

First, in step S4001, the communication unit 301 reads image data from the camera 100 and, as necessary, from the storage device 200. Next, in step S4002, the person detection unit 302 and the first integration unit 303 perform a first integration process. The first integration process identifies the persons, that is, the tracking targets, that appear across consecutive image frames. For the first integration process, the person detection unit 302 first detects feature amounts, such as SIFT feature amounts, using a tracking technique. The first integration unit 303 uses the feature amounts detected by the person detection unit 302 to identify the same tracking target across consecutive image frames, and assigns a unique ID to each identified tracking target. In other words, the first integration unit 303 associates the same tracking target between consecutive image frames.
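The association in step S4002 can be sketched as follows. This is a minimal illustration only: it assumes a position-only, greedy nearest-center match, whereas the embodiment uses feature amounts such as SIFT. The function name `associate`, the `next_id` counter, and the `max_dist` threshold are hypothetical and do not come from the source.

```python
import math


def associate(prev_tracks, detections, next_id, max_dist=50.0):
    """Greedily match current detections to tracks from the previous frame.

    prev_tracks: dict mapping track ID -> (x, y) center in the previous frame
    detections:  list of (x, y) centers detected in the current frame
    next_id:     first unused track ID (hypothetical bookkeeping)
    max_dist:    assumed maximum distance for two centers to be the same target
    Returns (current_tracks, next_id).
    """
    current = {}
    unmatched = dict(prev_tracks)
    for det in detections:
        # Closest not-yet-matched track from the previous frame, if any.
        best_id = min(
            unmatched,
            key=lambda tid: math.dist(unmatched[tid], det),
            default=None,
        )
        if best_id is not None and math.dist(unmatched[best_id], det) <= max_dist:
            del unmatched[best_id]      # same target: keep its ID
        else:
            best_id = next_id           # no match: issue a new unique ID
            next_id += 1
        current[best_id] = det
    return current, next_id


prev = {0: (10, 10), 1: (100, 100)}
tracks, next_id = associate(prev, [(12, 11), (300, 300)], next_id=2)
```

Here the detection near (10, 10) keeps ID 0, while the distant detection receives the new ID 2.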

Next, in step S4003, the first determination unit 304 refers to the detection targets to which the first integration unit 303 has assigned IDs and checks whether a new tracking target has appeared. Because the first integration unit 303 assigns a unique ID to every tracking target, the first determination unit 304 can determine whether a new tracking target has appeared by recording the IDs from when the previous frame was processed. When a new tracking target has appeared (NO in step S4003), the second integration unit 305 performs a second integration process in step S4004. Note that the second integration process may be omitted, in which case processing proceeds from step S4003 to step S4005.
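The check in step S4003 amounts to comparing the current IDs with those recorded for the previous frame. A minimal sketch, assuming IDs are hashable values held in sets (the function name `find_new_ids` is hypothetical):

```python
def find_new_ids(previous_ids, current_ids):
    """Return the IDs present in the current frame but not in the previous one."""
    return set(current_ids) - set(previous_ids)


previous_ids = {1, 2}
current_ids = {2, 3}
new_ids = find_new_ids(previous_ids, current_ids)  # only ID 3 is new
```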

The second integration unit 305 determines whether a newly appearing tracking target matches any tracking target detected in the past. Specifically, the second integration unit 305 uses face recognition technology to calculate a score value for the new tracking target and, based on that score value, determines whether the new tracking target matches any previously detected tracking target. When the new tracking target is determined to match a previously detected tracking target, the second integration unit 305 treats the two as the same target and assigns them a single unique ID. In other words, the second integration unit 305 associates the same tracking target between the current image frame and past image frames. For example, the second integration unit 305 assigns the ID of the previously detected tracking target to the new tracking target. If, on the other hand, the new tracking target is determined not to match any previously detected tracking target, it keeps the ID assigned by the first integration unit 303.
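The score-based matching of the second integration process might look like the sketch below. The source only states that a face recognition score is used; cosine similarity over feature vectors, the `min_score` threshold, and the function names are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def reidentify(new_feature, past_features, min_score=0.8):
    """Return the ID of the best-matching past target, or None if no score
    reaches min_score (an assumed threshold)."""
    best_id, best_score = None, min_score
    for past_id, feature in past_features.items():
        score = cosine_similarity(new_feature, feature)
        if score >= best_score:
            best_id, best_score = past_id, score
    return best_id


past = {7: [1.0, 0.0], 9: [0.0, 1.0]}
matched_id = reidentify([0.9, 0.1], past)  # matches past target 7
```

When `reidentify` returns `None`, the new target would simply keep the ID it received in the first integration process.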

Next, in step S4005, the second determination unit 306 determines whether a detected tracking target is crossing another target. In typical tracking processing, to cope with a tracking target being temporarily hidden behind some object, the targets before and after the occlusion are merged even when it is somewhat doubtful that they are the same target. As a result, person information may become mixed where people cross paths and in crowded environments. The second determination unit 306 in the present embodiment therefore detects the position information of each tracking target and uses it to determine whether the target is crossing. Possible conditions for this determination include the distance to other tracking targets and the number of other tracking targets present.

FIG. 5 illustrates the concept of the crossing region. Rectangle information for the face region or the person frame can be obtained as the position information of a tracking target. The second determination unit 306 obtains the distance between tracking targets by, for example, calculating the center coordinates of each rectangle and computing the Euclidean distance to the other centers. The second determination unit 306 can also calculate how many other tracking targets are nearby by counting the detection frames whose distance is within a predetermined threshold 1. By comparing this count with a predetermined threshold 2, the second determination unit 306 can determine, for each tracking target frame, whether it is in a crossing region. If the crossing determination finds that the target is in a crossing state (YES in step S4005), the first integration process is interrupted (step S4006). That is, when a target is determined to be in a crossing state, no update is performed with a feature amount extracted for it. This can be realized, for example, by not passing the tracking result detected by the first integration unit 303 to the subsequent processing. As a result, a tracking target determined to be in a crossing state is not processed in step S4007 and thereafter, and its feature amount is not updated.
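The crossing determination described for FIG. 5 can be sketched as follows. Rectangles are assumed to be `(x, y, width, height)` tuples, and a target is treated as crossing when the number of nearby targets is greater than or equal to threshold 2; the source does not state the direction of either comparison, so both are assumptions, as are the function names.

```python
import math


def rect_center(rect):
    """Center of an (x, y, width, height) rectangle."""
    x, y, w, h = rect
    return (x + w / 2.0, y + h / 2.0)


def is_crossing(target_id, rects, threshold1, threshold2):
    """Decide whether one tracking target is in a crossing region.

    rects:      dict mapping track ID -> (x, y, width, height)
    threshold1: maximum center-to-center Euclidean distance to count as nearby
    threshold2: minimum number of nearby targets for a crossing state
    """
    center = rect_center(rects[target_id])
    nearby = sum(
        1
        for other_id, rect in rects.items()
        if other_id != target_id
        and math.dist(center, rect_center(rect)) <= threshold1
    )
    return nearby >= threshold2


rects = {1: (0, 0, 10, 10), 2: (5, 0, 10, 10), 3: (200, 200, 10, 10)}
# Only targets that are not crossing go on to the feature update step.
updatable = [tid for tid in rects if not is_crossing(tid, rects, 20.0, 1)]
```

With these values, targets 1 and 2 are close enough to count as crossing, so only target 3 remains in `updatable`.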

In step S4007, the feature amount extraction unit 307 extracts and updates the person feature amounts for each tracking target. FIG. 6 shows examples of the person feature amounts that are extracted. In addition to low-level feature amounts such as SIFT, LBP, and HOG, which represent facial organ positions and the color gradient information around them, there are also high-level feature amounts such as age, gender, glasses, beard, hair color, and clothing color. Because the recognition result from a single frame alone is unstable and may fluctuate, judging the results comprehensively over multiple frames yields a more accurate analysis result. For example, this can be achieved by majority voting over results that include past frames and adopting the most frequently detected result. For quantities that vary, such as facial organ position, age, and color information, a similarly stable analysis result can be obtained by taking the average.
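The two multi-frame aggregation strategies named above, majority voting and averaging, can be sketched as follows. The function names and the sample observations are hypothetical.

```python
from collections import Counter
from statistics import mean


def aggregate_categorical(observations):
    """Majority vote: the value observed most often across frames."""
    return Counter(observations).most_common(1)[0][0]


def aggregate_numeric(observations):
    """Average of a varying quantity (e.g. estimated age) across frames."""
    return mean(observations)


gender_per_frame = ["male", "female", "male"]
age_per_frame = [30, 32, 34]
gender = aggregate_categorical(gender_per_frame)
age = aggregate_numeric(age_per_frame)
```

A single misclassified frame, such as the second gender observation here, is outvoted by the other frames.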

In step S4008, the registration determination unit 3081 determines whether the feature amounts of the tracked person can be registered. The registration determination unit 3081 can make this determination based on information such as: the number of seconds elapsed since tracking started, obtained as a result of the first integration process; whether the target was in a crossing condition immediately beforehand, obtained as the crossing determination result; and whether a certain time has elapsed since the target entered a crossing condition, obtained as a result of the second integration process. The registration determination unit 3081 can also refer to the person feature amount acquisition results and check how long the extracted and updated person attributes have remained unchanged, in order to judge whether a sufficient number of feature amounts has been acquired, and use this as a criterion for whether registration is possible.
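One way to combine these criteria is sketched below. The source lists the kinds of information used but not how they are combined; the `TrackState` fields, the requirement that all conditions hold, and the default threshold values are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackState:
    seconds_tracked: float                   # time since tracking started
    seconds_since_crossing: Optional[float]  # None if the target never crossed
    seconds_attributes_stable: float         # time the attributes stayed unchanged


def can_register(state, min_tracked=2.0, min_since_crossing=1.0, min_stable=1.0):
    """Return True when every (assumed) registration condition is satisfied."""
    if state.seconds_tracked < min_tracked:
        return False  # not tracked long enough yet
    if (state.seconds_since_crossing is not None
            and state.seconds_since_crossing < min_since_crossing):
        return False  # too soon after a crossing condition
    return state.seconds_attributes_stable >= min_stable


ready = can_register(TrackState(5.0, None, 3.0))
just_crossed = can_register(TrackState(5.0, 0.2, 3.0))
```

In this sketch `ready` is true, while `just_crossed` is false because the target left a crossing condition only 0.2 seconds earlier.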

When it is determined that the feature amounts of the tracked person can be registered (YES in step S4008), the registered feature amount transmission unit 3082 finally, in step S4009, transmits the person feature amounts to be registered to the person search device 400, and the processing is complete.

The person search device 400 performs registration processing for person information sent from the image analysis device 300 via the communication unit 401, and search processing for person information sent from the image display device 500. The person information it receives is accumulated in the recording unit 402 in a format that is easy to search later. Specifically, to keep the search response time as short as possible, the person search device 400 applies clustering in advance based on the feature amounts passed as registration information, or places the search index and the data to be searched in memory beforehand.

Finally, the image display device 500 displays on the display unit 502 the distributed image data read via the communication unit 501, the person search results, and the user interface for searching. Based on the information returned from the person search device 400, the person search results are presented in a form that is easy for the user to understand, for example in chronological order, in order of similarity, or superimposed on a map. The image display device 500 also acquires the search target and person attribute information from user operations on the search user interface, and uses them when querying the person search device 400.

As described above, according to the present embodiment, the feature amounts of a tracking target extracted in a crowded environment, where tracking may be interrupted or the tracking target may be swapped with another person, are not used for updating, which prevents a loss of accuracy in the person search system. In other words, by switching the method of integrating person feature amounts depending on whether the target is crossing, the feature amounts registered in the person search device are made more efficient, which improves the performance of the person search system, shortens response time, and improves search accuracy. Although the tracking target has been described as a person in the present embodiment, it may be a subject other than a person.

[Other Embodiments]
Although an example embodiment has been described in detail above, the present invention can be embodied as, for example, a system, an apparatus, a method, a program, or a recording medium (storage medium). Specifically, it may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, an imaging device, and a web application), or to an apparatus consisting of a single device.

The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiment is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

100 network camera, 101 image acquisition unit, 102 encoding unit, 103 communication unit, 200 network storage device, 201 communication unit, 202 recording unit, 300 image analysis device, 301 communication unit, 302 person detection unit, 303 first integration unit, 304 first determination unit, 305 second integration unit, 306 second determination unit, 307 feature amount acquisition unit, 308 person registration unit, 3081 registration determination unit, 3082 registered feature amount transmission unit, 400 person search device, 401 communication unit, 402 recording unit, 403 search unit, 500 image display device, 501 communication unit, 502 display unit, 600 LAN

Claims

An image analysis apparatus comprising:
acquisition means for acquiring a plurality of images constituting a moving image;
first integration means for associating a subject detected from an image acquired by the acquisition means with a subject in another image, using at least one of the feature amount and the position of the detected subject;
update means for updating the feature amount of the subject associated by the first integration means; and
crossing determination means for determining whether a subject detected from the image acquired by the acquisition means is crossing another subject,
wherein the update means does not update the feature amount of a subject determined by the crossing determination means to be crossing, among the subjects detected from the image acquired by the acquisition means.

The image analysis apparatus according to claim 1, wherein the first integration means performs tracking processing for identifying a subject that appears in consecutive images acquired by the acquisition means.

The image analysis apparatus according to claim 1, further comprising second integration means for associating a subject that was not associated by the first integration means with the same subject detected in the past, based on a feature amount of the subject,
wherein the update means further updates the feature amount of the subject associated by the second integration means.

The image analysis apparatus according to claim 3, wherein the second integration means performs face recognition processing for identifying a subject that appears in a past image acquired by the acquisition means.

The image analysis apparatus according to claim 1, wherein the crossing determination means determines whether a subject associated by the first integration means is crossing based on the number of other subjects and the distance between the subject and the other subjects in the image acquired by the acquisition means.

The image analysis apparatus according to claim 1, further comprising transmission means for transmitting the updated feature amount to an external apparatus in order to register the updated feature amount.

The image analysis apparatus according to claim 6, further comprising registration determination means for determining whether a predetermined condition for registration with the external apparatus is satisfied.

An image analysis method comprising:
an acquisition step of acquiring a plurality of images constituting a moving image;
a first integration step of associating a subject detected from an image acquired in the acquisition step with a subject in another image, using at least one of the feature amount and the position of the detected subject;
an update step of updating the feature amount of the subject associated in the first integration step; and
a crossing determination step of determining whether a subject detected from the image acquired in the acquisition step is crossing another subject,
wherein, in the update step, the feature amount of a subject determined in the crossing determination step to be crossing, among the subjects detected from the image acquired in the acquisition step, is not updated.

A program that causes a computer to function as each unit of the image analysis apparatus according to claim 1.