JP6440749B2

JP6440749B2 - Surveillance image processing apparatus and surveillance image processing method

Info

Publication number: JP6440749B2
Application number: JP2017005215A
Authority: JP
Inventors: 亮和田
Original assignee: Toshiba Teli Corp
Current assignee: Toshiba Teli Corp
Priority date: 2017-01-16
Filing date: 2017-01-16
Publication date: 2018-12-19
Anticipated expiration: 2037-01-16
Also published as: JP2018117181A

Description

This embodiment relates to a monitoring image processing apparatus and a monitoring image processing method.

Monitoring devices using a monocular camera have long been developed. Monitoring devices have also been developed in which a 360° lens (fisheye lens) is attached to a monocular camera to widen the imaging field of view. Further, monitoring devices have been developed that use a fisheye lens to image, for example, a conference room, acquire a wide-angle field-of-view monitoring image, unwrap that image into a panoramic image, and perform face detection on the unwrapped panoramic image.

JP 2015-19162 A; JP 2016-25516 A; JP 2016-39539 A

Further improvement is demanded of monitoring devices that use a 360° lens (fisheye lens) to widen the imaging field of view.

For example, tracking of a specific person is required to be accurate and robust. Tracking is relatively stable while the person's face is turned toward the camera. This is because, when the facial feature pattern of the photographed person is compared with a comparison pattern recorded in the monitoring device (usually a facial feature pattern seen from the front), the two are highly likely to be similar.

However, the face of the person to be monitored cannot always be captured stably from the front. For example, the person being tracked may turn so that the camera images the person's face from the side or from behind. If such a state continues for a relatively long time, the monitoring device can no longer track the specific person. That is, the device enters a state in which the target person cannot be tracked (lost state).

Accordingly, an object of the present embodiment is to provide a monitoring image processing apparatus and a monitoring image processing method with improved ability to track an object to be tracked (for example, a person).

An apparatus that is one example of the embodiment comprises an imaging unit, a data processing device, and a display device, the imaging unit inputting wide-area image data, obtained by imaging a space including an object, to the data processing device, wherein
the data processing device comprises:
a first image processing unit that tracks an individual feature (face) of the object, a second image processing unit that tracks a motion feature of the object, and an attention image output unit that cuts out a first attention image (face image) including an image of the individual feature (face) and supplies it to the display device,
the first image processing unit, when detecting the image of the individual feature (face), generates first image area information for cutting out the first attention image (face image) from the wide-area image data, based on rectangular information surrounding the image of the individual feature (face),
the second image processing unit, when the first image processing unit loses track of the image of the individual feature (face), tracks a second attention image (moving body image) based on motion detection information of the image and generates second image area information with which the second attention image (moving body image) can be cut out from the wide-area image data,
while the first image processing unit is tracking the individual feature (face), the second image processing unit virtually tracks the second attention image (moving body image) based on the first image area information generated by the first image processing unit, using it as the second image area information for cutting out the second attention image (moving body image), and
the attention image output unit cuts out the image of the individual feature (face) from the wide-area image data based on at least the first image area information and supplies it to the display device.

Another apparatus of the embodiment comprises an imaging unit, a data processing device, and a display device, the imaging unit inputting wide-area image data, obtained by imaging a space including an object, to the data processing device, wherein
the data processing device comprises:
a first image processing unit that tracks an individual feature (face) of the object, a second image processing unit that tracks a motion feature of the object, and an attention image output unit that cuts out a first attention image (face image) including an image of the individual feature (face) and supplies it to the display device,
the first image processing unit, when detecting the image of the individual feature (face), generates first image area information for cutting out the first attention image (face image) from the wide-area image data, based on rectangular information surrounding the image of the individual feature (face),
the second image processing unit, when the first image processing unit loses track of the image of the individual feature (face), tracks a second attention image (moving body image) based on motion detection information of the image and generates second image area information with which the second attention image (moving body image) can be cut out from the wide-area image data,
while the second image processing unit is tracking the motion feature, the first image processing unit virtually tracks the individual feature (face) based on the second image area information generated by the second image processing unit, using it as the first image area information for cutting out the first attention image (face image), and
the attention image output unit cuts out the image of the individual feature (face) from the wide-area image data based on at least the first image area information and supplies it to the display device.

FIG. 1 is a block diagram schematically showing an example of an omnidirectional monitoring image display processing system in which the monitoring image processing apparatus of the present embodiment is used.
FIG. 2 is a block diagram showing a more specific example of the block configuration shown in FIG. 1.
FIG. 3 is an explanatory diagram schematically showing the configuration and operation of the monitoring image processing apparatus of the present embodiment.
FIG. 4 is a diagram showing an example of state transition routes when tracking by face detection and tracking by moving body detection are performed in the monitoring image processing apparatus of the present embodiment.
FIG. 5 is a flowchart schematically showing the operation of the apparatus according to the present embodiment.
FIG. 6 is a flowchart showing in more detail the tracking process for tracking the face rectangular image shown in FIG. 5.
FIG. 7 is a flowchart continuing from the flowchart of FIG. 6.
FIG. 8 is a flowchart showing in more detail the tracking process for tracking the moving body feature shown in FIG. 5.
FIG. 9 is an explanatory diagram of an omnidirectional high-definition monitoring image and a panoramic image in the monitoring image processing apparatus of the present embodiment.
FIG. 10 is a diagram showing an example of an image displayed on the display device by the monitoring image processing apparatus of the present embodiment.
FIG. 11 is a diagram showing an example of an image displayed on the display device when tracking of the individual feature (face) in the first image processing unit A1 enters the lost state.

Embodiments are described below with reference to the drawings. FIG. 1 shows an omnidirectional monitoring (which may also be called wide-angle field-of-view monitoring) image display processing system to which a monitoring image processing apparatus according to one embodiment is applied. The embodiment uses a high-definition monocular camera (a single IP camera) 101 (which may be called an imaging unit) with 1.3 megapixels or more. A 360° lens (fisheye lens) is attached to the monocular camera 101, which can thereby acquire an omnidirectional high-definition monitoring image.

The data stream of the acquired omnidirectional high-definition monitoring image is received by a stream receiving unit in the data processing device 200A, captured frame by frame by the capture 201, and input to the image buffer memory 202. When viewed on a display screen, the omnidirectional high-definition monitoring image appears with its imaging region arranged as an ellipse within the image frame. The omnidirectional high-definition monitoring image has a high resolution of 2560 × 1920 pixels or more.

FIG. 1 is a basic block diagram of one embodiment. According to the embodiment, the apparatus comprises an imaging unit 101, a data processing device 200A, and a display device 321, and the imaging unit 101 inputs wide-area image data, obtained by imaging a space including an object, to the data processing device 200A. The data processing device comprises a first image processing unit A1 that tracks an individual feature (face) of the object, a second image processing unit A2 that tracks a motion feature of the object, and an attention image output unit A3 that cuts out an attention image and supplies it to the display device 321.

When the first image processing unit A1 is detecting an image of the individual feature (face), it tracks that individual feature (face) and generates first rectangular information (which may be called face detection rectangular coordinate information or first image area information) surrounding the image area where the image of the individual feature (face) is located. When the first image processing unit A1 loses track of the individual feature (face), the second image processing unit A2 tracks a motion feature based on motion detection information of the image present at the position where tracking was lost, and generates second rectangular information (which may be called moving body detection rectangular coordinate information or second image area information) surrounding the image area where the motion feature is located.

The attention image output unit A3 cuts out an attention image for output from the wide-area image data using at least the first rectangular information (face detection rectangular coordinate information or first image area information) and supplies it to the display device 321. The second rectangular information (moving body detection rectangular coordinate information or second image area information) need not always be used.

When the first image processing unit A1 has detected an image of the individual feature (face), the attention image output unit A3 uses the first rectangular information (face detection rectangular coordinate information or first image area information) preferentially.
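
As an illustrative sketch only, and not part of the disclosed embodiment, the priority rule for the attention image output unit A3 might be expressed as follows. The function name `select_crop_rect`, the `(x, y, width, height)` tuple layout, and the string labels are assumptions introduced here for illustration.

```python
from typing import Optional, Tuple

# Assumed layout: (x, y, width, height) in wide-area image pixels.
Rect = Tuple[int, int, int, int]


def select_crop_rect(face_rect: Optional[Rect],
                     motion_rect: Optional[Rect]) -> Tuple[Optional[Rect], str]:
    """Return the rectangle to crop and a label naming where it came from."""
    if face_rect is not None:
        # First rectangular information takes priority whenever a face is detected.
        return face_rect, "face"
    if motion_rect is not None:
        # Fall back to the second rectangular information while the face is lost.
        return motion_rect, "motion"
    return None, "none"
```

A caller could use the returned label to decide whether to show the face image or some indication that the face is currently lost.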

Various output display forms are possible for the second attention image. For example, only the first attention image (face image) may be cut out and output, and when the second attention image (moving body image) can be cut out, a message such as "Lost" may be output to the display device 321. Alternatively, the moving body image (an image with no face, or one showing a pseudo face) may be displayed within a rectangular frame.

When the first image processing unit A1 has lost track of the individual feature (face), it may virtually track the individual feature (face) by referring to the second rectangular information generated by the second image processing unit A2. In this case, the first image processing unit A1 may perform the virtual tracking using first rectangular information that indicates an area slightly larger than the area represented by the second rectangular information. The enlargement is, for example, on the order of several tens of pixels vertically and horizontally.
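
The enlargement described here can be sketched as below. This is a minimal illustration under stated assumptions: the name `expand_rect`, the `(x, y, width, height)` layout, and the clamping to the image bounds are not taken from the embodiment, and the margin of 20 pixels in the usage note merely stands in for "several tens of pixels".

```python
def expand_rect(rect, margin, image_width, image_height):
    """Grow rect by margin pixels on every side, clamped to the image bounds."""
    x, y, w, h = rect
    left = max(0, x - margin)
    top = max(0, y - margin)
    right = min(image_width, x + w + margin)
    bottom = min(image_height, y + h + margin)
    return (left, top, right - left, bottom - top)
```

With a 2560 × 1920 image, `expand_rect((100, 100, 50, 60), 20, 2560, 1920)` gives `(80, 80, 90, 100)`.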

Conversely, while the first image processing unit A1 is tracking the individual feature (face) (that is, while it is not lost), the second image processing unit A2 may virtually track the moving body by referring to the first rectangular information. In this case too, when the first image processing unit A1 loses track of the individual feature (face), the second image processing unit A2 continues by tracking the moving body and generates the second rectangular information. During virtual tracking, the second image processing unit A2 may use second rectangular information that indicates an area slightly larger than the area represented by the first rectangular information. The enlargement is, for example, on the order of several tens of pixels vertically and horizontally.
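
The mutual hand-over between the two image processing units can be pictured with the following sketch. It is a simplified assumption-laden model, not the state transitions of the embodiment: the class `DualTracker`, its fields, the per-frame `update` call, and the default margin are all hypothetical, and the detectors themselves are represented only by their per-frame results.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Rect = Tuple[int, int, int, int]  # assumed layout: (x, y, width, height)


def grow(rect: Rect, margin: int) -> Rect:
    """Enlarge rect by margin pixels on every side (no clamping in this sketch)."""
    x, y, w, h = rect
    return (x - margin, y - margin, w + 2 * margin, h + 2 * margin)


@dataclass
class DualTracker:
    margin: int = 20                    # hypothetical stand-in for "several tens of pixels"
    face_rect: Optional[Rect] = None    # first rectangular information
    motion_rect: Optional[Rect] = None  # second rectangular information
    face_lost: bool = True

    def update(self, detected_face: Optional[Rect],
               detected_motion: Optional[Rect]) -> Optional[Rect]:
        if detected_face is not None:
            # Face is tracked: the motion side follows it virtually,
            # so it already holds a position if the face is lost later.
            self.face_lost = False
            self.face_rect = detected_face
            self.motion_rect = grow(detected_face, self.margin)
        else:
            self.face_lost = True
            if detected_motion is not None:
                # Face is lost: motion tracking takes over, and the face side
                # follows it virtually within a slightly larger area.
                self.motion_rect = detected_motion
                self.face_rect = grow(detected_motion, self.margin)
        return self.face_rect
```

Keeping both rectangles up to date on every frame is what lets either side resume from the other's last known position.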

The data processing device 200A includes a blocking luminance expansion unit 320. The blocking luminance expansion unit 320 (also called the blocking luminance expansion processing unit 320) applies pixel-level blocking luminance expansion processing to the attention image (for example, a face rectangular image obtained by extracting a region including a face as a rectangle), correcting light and dark at the pixel level. This processing corrects, at the pixel level, portions of the monitored face rectangular image that differ greatly in brightness, so that the face rectangular image becomes an image with a dynamic range that is easy to monitor. Details of this blocking luminance expansion processing technique are disclosed in Japanese Patent Nos. 5486963, 5866146, and 5887067 by the same applicant.

The display device 321 displays the omnidirectional monitoring image received from the data processing device 200A, and the tracked face rectangular image is displayed within this omnidirectional monitoring image. The user can thus easily confirm where the attention image (face image) has moved within the imaging region.

FIG. 2 is a block diagram showing the configuration of FIG. 1 in more detail.

The omnidirectional high-definition monitoring image (which may be called wide-area image data) in the image buffer memory 202 is converted by the resizing unit 11 into an image with VGA resolution (640 × 480 pixels) and input to the blocking luminance expansion unit 320 described above.

The omnidirectional high-definition monitoring image (wide-area image data) in the image buffer memory 202 is also converted by the resizing unit 11 in the first image processing unit A1 into an image with a resolution of at most 1920 × 1080 pixels and input to the panoramic image generation unit 203.

The image input to the panoramic image generation unit 203 is converted into a panoramic image in which the entire surroundings are rendered as frontal views. In this case the elliptical image area is divided, for example, into upper and lower halves, which are displayed as a first panoramic image and a second panoramic image. Because of the preceding resizing, these panoramic images have a resolution of at most 1920 × 1080 pixels, fewer pixels than the omnidirectional high-definition monitoring image, which lightens the data processing load.
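
To give a feel for the reduction in pixel count, the following sketch computes a resized frame size under a 1920 × 1080 limit. Whether the resizing unit preserves the aspect ratio is not stated in the text, so that behaviour, along with the names `fit_within` and `pixel_ratio`, is an assumption made purely for illustration.

```python
def fit_within(src_w, src_h, max_w, max_h):
    """Largest size with the source aspect ratio that fits inside max_w x max_h."""
    if src_w <= max_w and src_h <= max_h:
        return (src_w, src_h)  # already small enough; never upscale
    # Integer cross-multiplication avoids floating-point rounding.
    if max_w * src_h <= max_h * src_w:
        return (max_w, src_h * max_w // src_w)  # width is the limiting side
    return (src_w * max_h // src_h, max_h)      # height is the limiting side


def pixel_ratio(src, dst):
    """Fraction of the source pixel count that remains after resizing."""
    return (dst[0] * dst[1]) / (src[0] * src[1])
```

For a 2560 × 1920 source, `fit_within(2560, 1920, 1920, 1080)` gives `(1440, 1080)`, which is roughly 32% of the original pixel count under this assumption.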

The panoramic image from the panoramic image generation unit 203 serves as the material with which the image tracking processing unit 204 performs face detection processing, which detects an object to be detected or an attention image (for example, a person's face), and tracking processing of the detected face.

When the image tracking processing unit 204 detects a person's face, it sends the face image information, attribute information (estimated age, sex, presence or absence of a mask, presence or absence of glasses, and so on), and face rectangular coordinate information enclosing the detected face area in a rectangle to the object information buffer 205.

The attribute information is extracted from the face image information. The attribute information is not limited to this, however: various kinds of information, such as estimated age, sex, presence or absence of a mask, presence or absence of glasses, the person's clothes, the shape of the clothes, accessories, colors, and hairstyle, may be acquired and associated with an identifier (ID) of the person.

The detected face image information is stored in the buffer memory 205a, the attribute information in the buffer memory 205b, and the face rectangular coordinate information indicating the detected face area (the area including the face) in the buffer memory 205c.

Here, the face rectangular coordinate information indicating the detected face area was obtained from the panoramic image, and the face image in the area enclosed by that information has a resolution of at most 1920 × 1080 pixels. The face rectangular coordinate information and resolution at this stage therefore do not match those for the omnidirectional high-definition monitoring image in the image buffer memory 202. For this reason, once the face image is stored in the buffer memory 205, the coordinate conversion processing unit 206 converts the face image position coordinates on the panoramic image into face image position coordinates on the omnidirectional high-definition monitoring image. Next, the detection rectangular coordinate information generation unit 207 generates face detection rectangular coordinate information (coordinate information of the frame surrounding the face) from the face image position coordinates on the omnidirectional high-definition monitoring image produced by the coordinate conversion processing unit 206. This face detection rectangular coordinate information is given to the attention area image creation unit 315 either via the object tracking processing unit 314 or directly. The object tracking processing unit 314 can select, according to the situation, either the face or the moving body detection rectangular coordinate information output from the first image processing unit A1 or the second image processing unit A2 and give it to the attention area image creation unit 315.
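
The text does not give the actual mapping between panorama and omnidirectional coordinates, so the following is only a sketch under an assumed simple polar model: each panorama column corresponds to an angle, each row to a radius inside the elliptical image region. The function names, the `ellipse` parameter `(cx, cy, rx, ry)`, and the sampling density are all hypothetical. Expressing the ellipse in full-resolution pixels makes the result land directly on the high-definition image, which also absorbs the difference in resolution.

```python
import math


def pano_to_fisheye(u, v, pano_size, ellipse, start_deg=0.0, span_deg=180.0):
    """Map panorama pixel (u, v) to a point (x, y) in the elliptical fisheye image.

    Assumed model: column u sweeps the angle across span_deg, and row v sweeps
    the radius from the rim of the ellipse (v = 0) to its centre (v = pano_h).
    """
    pano_w, pano_h = pano_size
    cx, cy, rx, ry = ellipse  # centre and semi-axes, in full-resolution pixels
    theta = math.radians(start_deg + span_deg * u / pano_w)
    r = 1.0 - v / pano_h
    return (cx + r * rx * math.cos(theta), cy + r * ry * math.sin(theta))


def pano_rect_to_fisheye_rect(rect, pano_size, ellipse,
                              start_deg=0.0, span_deg=180.0, samples=8):
    """Bounding box, on the fisheye image, of a rectangle given on the panorama."""
    x, y, w, h = rect
    xs, ys = [], []
    # A straight rectangle becomes a curved patch, so sample a grid of points
    # across it instead of converting only the four corners.
    for i in range(samples + 1):
        for j in range(samples + 1):
            px, py = pano_to_fisheye(x + w * i / samples, y + h * j / samples,
                                     pano_size, ellipse, start_deg, span_deg)
            xs.append(px)
            ys.append(py)
    left, top = math.floor(min(xs)), math.floor(min(ys))
    right, bottom = math.ceil(max(xs)), math.ceil(max(ys))
    return (left, top, right - left, bottom - top)
```

For the second panoramic image the same functions would be called with a different `start_deg`, for example 180.0 under this model.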

Now suppose that the attention area image creation unit 315 selects the face detection rectangular coordinate information from the first image processing unit A1. Based on the face detection rectangular coordinate information generated by the detection rectangular coordinate information generation unit 207, the attention area image creation unit 315 then cuts out, from the omnidirectional high-definition monitoring image stored in the image buffer memory 202, a high-definition face rectangular image corresponding to the face image detected by the image processing unit 204 (called the face rectangular image or attention image). The high-definition face rectangular image cut out from the omnidirectional high-definition monitoring image is then corrected. The correction normalizes the distorted image into an image with the aspect ratio of the display device 321 (4:3 or 16:9). The corrected face rectangular image is stored in the object image buffer memory 316.
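
Only the aspect-ratio part of this correction is sketched below; the distortion correction itself is not described in enough detail to illustrate. The sketch assumes that the cut-out rectangle is widened or heightened about its centre until it matches the display aspect ratio. The name `fit_rect_to_aspect` and the `(x, y, width, height)` layout are assumptions.

```python
def fit_rect_to_aspect(rect, aspect_w, aspect_h):
    """Enlarge rect about its centre so that width:height equals aspect_w:aspect_h."""
    x, y, w, h = rect
    if w * aspect_h < h * aspect_w:
        # Too narrow for the target ratio: widen (ceiling division keeps it >= target).
        new_w = -(-h * aspect_w // aspect_h)
        new_h = h
    else:
        # Too flat (or already matching): heighten.
        new_w = w
        new_h = -(-w * aspect_h // aspect_w)
    # Keep the centre fixed; work with doubled coordinates to stay in integers.
    new_x = (2 * x + w - new_w) // 2
    new_y = (2 * y + h - new_h) // 2
    return (new_x, new_y, new_w, new_h)
```

The result is not clamped to the image bounds, so a caller would still need to handle rectangles near the image edge.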

The face detection rectangular coordinate information generated by the detection rectangular coordinate information generation unit 207 is used, with respect to the image buffer memory 202, as data for tracking the detection target that encloses the face in the attention area and for generating a detection result image. This face detection rectangular coordinate information may also be referred to by the second image processing unit A2.

The high-definition face rectangular image stored in the object image buffer memory 316 is resized by the resizing unit 14 to a screen size matching the pixel configuration of the display device 321 and sent to the blocking luminance expansion processing unit 320. The face rectangular image processed by the blocking luminance expansion processing unit 320 is displayed on the display device 321 as a monitoring image.

The face rectangular image displayed on the display device 321 is a high-density monitoring image that has been cut out from the omnidirectional high-definition monitoring image and corrected by the blocking luminance expansion processing to an image with a dynamic range that is easy to monitor.

As described above, the face rectangular image to be monitored is displayed on the display device 321 as a high-quality image with a dynamic range that is easy to monitor (see 303 in FIG. 10).

Furthermore, the omnidirectional high-definition monitoring image stored in the image buffer memory 202 is resized by the resizing unit 13 to a size matching the screen of the display device 321. The resized omnidirectional monitoring image has a resolution of, for example, 640 × 480 pixels. It undergoes luminance expansion processing in the blocking luminance expansion unit 320 and is composited so as to overlap the face rectangular image described above. This omnidirectional high-definition monitoring image makes it easy to judge where in the omnidirectional monitoring area the tracked object (in this embodiment, a person's face) has moved (see 302 in FIG. 10).

The face image, attribute information, and the like stored in the object information buffer 205 may also be input to the display device 321 and displayed additionally (see 311 and 312 in FIG. 10). The face image information stored in the buffer memory 205a and the attribute information stored in the buffer memory 205b (estimated age, sex, presence or absence of a mask, presence or absence of glasses, and so on) are accumulated in the object information buffer 205 in a mutually linked form. The accumulated data can be used in subsequent face image recognition processing and tracking processing.
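
One way to picture the mutually linked storage is a record per person keyed by an identifier, as in the sketch below. The class names `ObjectRecord` and `ObjectInfoBuffer`, the field types, and the use of an integer ID as the key are assumptions for illustration; the text does not specify the data layout.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ObjectRecord:
    face_image: bytes                       # plays the role of buffer memory 205a
    attributes: Dict[str, object]           # plays the role of buffer memory 205b
    face_rect: Tuple[int, int, int, int]    # plays the role of buffer memory 205c


class ObjectInfoBuffer:
    """Keeps face image, attributes, and rectangle linked under one person ID."""

    def __init__(self) -> None:
        self._records: Dict[int, ObjectRecord] = {}

    def store(self, person_id: int, record: ObjectRecord) -> None:
        self._records[person_id] = record

    def lookup(self, person_id: int) -> Optional[ObjectRecord]:
        return self._records.get(person_id)
```

Later recognition or tracking steps could then retrieve everything known about a person with a single `lookup` call.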

In the configuration of FIG. 2, the image buffer memory 202, the resizing unit 11, and the panoramic image generation unit 203 can be called first conversion means, which reduces the bit count of and resizes a wide-angle field-of-view monitoring image obtained by imaging a space with a viewing angle of 180° or more, and converts it into at least two parts, a first panoramic image and a second panoramic image. The image processing unit 204 and the object information buffer 205 can be called detection and tracking means, which detects and tracks at least a subject of interest from the first panoramic image or the second panoramic image.

Further, the coordinate conversion processing unit 206, the detection rectangular coordinate information generation unit 207, the attention area image creation unit 315, and the object image buffer memory 316 can be called attention image cutout means, which converts the coordinate data of the subject of interest on the panoramic image into coordinate data on the wide-angle field-of-view monitoring image and cuts out the attention image of the subject of interest from the wide-angle field-of-view monitoring image. The resizing unit 14, the blocking luminance expansion unit 320, the display device 321, and the like can be called display means, which displays at least the attention image cut out by the attention image cutout means.

The apparatus further includes a second image processing unit A2. The omnidirectional high-definition monitoring image in the image buffer memory 202 is converted by the resizing unit 12 into an image with a resolution of 320 × 240 pixels and input to the image tracking processing unit 311. The image tracking processing unit 311 detects and tracks a moving object in the image and generates moving object detection information. In the moving object detection rectangular coordinate information unit 312, the moving object detection information is converted into rectangular coordinate information surrounding the moving object being tracked (moving object detection rectangular coordinate information). In the image buffer memory 202, this moving object detection rectangular coordinate information is used as data for tracking a detection target that encloses a moving attention area and for generating a detection result image. That is, the moving object detection rectangular coordinate information is used when a moving object image is cut out from the omnidirectional high-definition monitoring image in the image buffer memory 202 and stored in the object image buffer memory 316.

Also in this case, the attention area image creation unit 315 applies correction processing to the high-definition face rectangular image cut out from the omnidirectional high-definition monitoring image. The correction processing normalizes the distorted image into an image with the aspect ratio of the display device 321 (4:3 or 16:9). The corrected face rectangular image is stored in the object image buffer memory 316.

Through the processing described above, the present apparatus can provide a monitoring image processing apparatus and a monitoring image processing method with an improved ability to track a target (for example, a person).

FIG. 3 is an explanatory diagram schematically showing the configuration and operation of the apparatus described above. The data processing device 200A receives the high-definition monitoring image from the imaging unit 101 at a stream receiving unit (S401). The received high-definition monitoring image is captured frame by frame by a capture unit (S403). Note that the high-definition monitoring image received by the stream receiving unit may be captured frame by frame (S403) after being temporarily stored in a storage medium, or while being stored (S402).

Next, the following processing is executed in the data processing device 200A. The captured high-definition monitoring image is preprocessed (S404). This preprocessing corresponds to the resizing and panoramic image generation described with reference to FIG. 2. The preprocessed panoramic image is subjected to tracking processing based on face detection by image processing and tracking processing based on moving object detection (S404, S405). The information obtained in the tracking processing based on face detection includes a face rectangular image, attribute information, face detection rectangular coordinate information, and the like. The face detection rectangular coordinate information of the panoramic image is converted into detection rectangular coordinate information corresponding to the coordinates of the omnidirectional high-definition image. Attribute information is also extracted.

In addition to the tracking processing based on face detection, the apparatus also executes tracking processing based on moving object detection in the second image processing unit A2. This tracking processing also generates moving object detection rectangular coordinate information for cutting out the attention image (S406). The first image processing unit A1, which performs face detection, can refer to this moving object detection rectangular coordinate information. By this reference, the first image processing unit A1 can virtually track the attention image and can shorten the time from when face detection is actually performed until the face is determined. In other words, the face detection capability of the first image processing unit A1 becomes robust.

Object tracking processing is performed as described above (S407). Next, in the attention image output unit A3, the attention image is cut out from the wide-area image data based on the first rectangular information (face detection rectangular coordinate information) or the second rectangular information (moving object detection rectangular coordinate information), or based only on the first rectangular information, and the cut-out attention image is corrected. The corrected attention image then undergoes blocking luminance expansion processing (S409) and is supplied to the display device 321. As described above, tracking processing of the object (attention image) is performed by face detection or moving object detection.

Here, either the face detection rectangular coordinate information obtained by the tracking processing based on face detection or the moving object detection rectangular coordinate information obtained by the tracking processing based on moving object detection is used for cutting out the attention image. The basic state transitions are described below. This is only an example, and the apparatus is not necessarily limited to these state transitions.

FIG. 4 is a diagram showing an example of state transition routes when tracking processing by face detection and tracking processing by moving object detection are performed in the monitoring image processing apparatus of the present embodiment.

Assume that the apparatus is now in the face detection / object detection operation state J1. When a face (attention image) is detected here, the apparatus moves via path P1 to the tracking operation state J2 based on the face rectangle. As long as the face is detected (path P2), the tracking operation state J2 is maintained.

When the face is no longer detected in state J2, the apparatus moves via path P3 to the lost operation state J3. If a face is detected in the lost operation state J3, the apparatus returns to the tracking operation state J2 based on the face rectangle. If, in the lost operation state J3, no face is detected but an object (moving object) is detected, the apparatus moves to the tracking operation state J4 based on the moving object / stationary object rectangle. In state J4, even if a detected moving object stops for a certain period, it is still treated as a detected moving object, so the moving object tracking operation can be maintained.

If a face is detected in the tracking operation state J4 based on the moving object / stationary object rectangle, the apparatus moves via path P6 to the tracking operation state J2 based on the face rectangle. If the moving object (object) can no longer be detected either in state J4, the apparatus moves via path P7 to the lost operation state J3.

The lost operation state J3 is maintained via path P8 until a certain time has elapsed. If the certain time elapses without a face or object being detected, the apparatus moves via path P9 to the face detection / object detection operation state J1.
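The transitions among states J1 to J4 described above can be sketched as a small state machine. This is a minimal illustration, not the patented implementation: the names `State` and `next_state`, the per-frame boolean inputs, and the `lost_timeout` value are all assumptions (the description only says "a certain time"). The description also does not say what J1 does when only a moving object is found, so the sketch simply stays in J1 in that case.

```python
from enum import Enum


class State(Enum):
    J1_DETECT = "face/object detection"
    J2_FACE_TRACK = "tracking by face rectangle"
    J3_LOST = "lost"
    J4_OBJECT_TRACK = "tracking by moving/stationary object rectangle"


def next_state(state, face_found, object_found, lost_seconds, lost_timeout=3.0):
    """Return the next state for one frame.

    lost_timeout is a hypothetical value in seconds; the source only
    specifies "a certain time".
    """
    if state is State.J1_DETECT:
        # P1: a detected face starts face-rectangle tracking.
        return State.J2_FACE_TRACK if face_found else State.J1_DETECT
    if state is State.J2_FACE_TRACK:
        # P2 keeps tracking while the face is seen; P3 goes to lost.
        return State.J2_FACE_TRACK if face_found else State.J3_LOST
    if state is State.J3_LOST:
        if face_found:
            return State.J2_FACE_TRACK
        if object_found:
            return State.J4_OBJECT_TRACK
        # P9 after the timeout, otherwise P8 (stay lost).
        return State.J1_DETECT if lost_seconds >= lost_timeout else State.J3_LOST
    # State.J4_OBJECT_TRACK: P6 on a face, P7 when the object is gone too.
    if face_found:
        return State.J2_FACE_TRACK
    return State.J4_OBJECT_TRACK if object_found else State.J3_LOST
```

A caller would invoke `next_state` once per processed frame, passing the time spent in J3 as `lost_seconds`.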

In the face detection / object detection operation state J1, the face detection operation and the moving object detection operation are performed in parallel. For face detection, the rectangle information obtained by moving object detection may be coordinate-converted (that is, converted into the panoramic coordinate system) and used as face detection rectangle information. This allows face detection to be achieved relatively early.

In this case, the face detection rectangle used for face detection has an area several percent larger than the face rectangle used during tracking processing. This makes face detection easier even when the face moves left and right or up and down.

When the apparatus moves from the lost operation state J3 to the face detection / object detection operation state J1 and detects a face, it determines whether the same face as the one now detected has been detected in a past frame. If the same face exists, the detection data of the same face corresponding to that face image data (for example, face detection rectangular coordinate information, face likelihood score, number of detections, latest detection frame number, number of tracked frames, and frame interval since the previous detection) is updated. If the same face has not been detected in past face detection, new face detection data (for example, first detection frame number, face ID information, face detection rectangular coordinate information, and face likelihood score) is created. In other words, the identification data (ID) that had been attached to the face (object: moving object) tracked so far is discarded, and a new ID is assigned to the new face (object) detected in state J1 and used to manage it.
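The update-or-create handling of face detection data described above might be organized as follows. This is a sketch under assumptions: the `FaceRecord` fields mirror the examples listed in the description, but the field names, the dictionary keyed by face ID, and the sequential ID scheme are hypothetical. Deciding whether two detections show the same face is outside the sketch; the caller passes `matched_id` (or `None` when no past face matches).

```python
from dataclasses import dataclass


@dataclass
class FaceRecord:
    face_id: int
    rect: tuple          # (x, y, width, height) of the face detection rectangle
    score: float         # face likelihood score
    first_frame: int     # first detection frame number
    latest_frame: int    # latest detection frame number
    detections: int = 1  # number of detections so far
    frame_gap: int = 0   # frames since the previous detection


def update_records(records, matched_id, rect, score, frame_no):
    """Update the record for matched_id, or create a new one if there is no match."""
    if matched_id is not None and matched_id in records:
        rec = records[matched_id]
        rec.frame_gap = frame_no - rec.latest_frame
        rec.rect, rec.score = rect, score
        rec.latest_frame = frame_no
        rec.detections += 1
        return rec
    # No matching past face: issue a new ID (hypothetical sequential scheme).
    new_id = max(records, default=0) + 1
    rec = FaceRecord(new_id, rect, score, frame_no, frame_no)
    records[new_id] = rec
    return rec
```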

FIG. 5 is a flowchart outlining the operation of the apparatus according to the present embodiment. Processing of the omnidirectional high-definition monitoring image is started. First, tracking processing that tracks the face rectangular image based on face detection is executed (SA1). When tracking of the face rectangular image enters the lost state (SA2), it is determined whether an object (moving object) has been detected (SA3). If an object has been detected, tracking processing based on the moving object / stationary object rectangle is executed (SA4).

If a face is detected in the lost operation state or the object detection operation state, the apparatus moves to state J2 as described with reference to FIG. 4.

FIGS. 6 and 7 are flowcharts showing the tracking processing that tracks the face rectangular image (SA1 in FIG. 5) in more detail. FIG. 8 shows in detail the operation flow when a moving object is detected from the lost state and tracked (SA4 in FIG. 5).

The tracking processing is managed using three states: "no tracking", "tracking", and "lost". While in the lost state, the apparatus monitors whether the lost state has continued for a specified time, and if it has, the state is set to "no tracking".

First, it is determined whether the immediately preceding tracking state was "no tracking" (SB1). If it was "no tracking", it is determined whether new face detection rectangle information based on face detection exists. If new face detection rectangle information exists, it is stored as the current tracking position rectangle information.

Next, it is determined whether the current tracking position rectangle information has been successfully acquired (or created) (SB4), and if so, the ID of the object being tracked is stored in a predetermined table (SB10).

If the "no tracking" determination in SB1 finds that tracking is in progress, face detection rectangle information with the same object ID as the object ID corresponding to the existing tracking position rectangle information is acquired (SB5). Next, it is determined whether the same face image as the face image obtained from the acquired face detection rectangle information exists in the predetermined table (SB6).

If the same face image exists, it is determined whether the current tracking position rectangle information has been successfully acquired (or created) (SB4), and if so, the ID of the object being tracked is stored in the predetermined table (SB10). From then on, the face image with the same object ID is tracked.

If it is determined in SB6 that the same face image does not exist, the existing tracking position rectangle information is adjusted by enlarging the rectangle by a specified size to the left, right, top, and bottom, and face detection rectangle information that overlaps the enlarged existing tracking position rectangle information is acquired. This covers the case where the same face has moved to a slightly shifted position, or where another face exists near the existing face. If a face is newly recognized in this way, its face image is used for the subsequent tracking processing. This strengthens the tracking processing.
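The enlarge-and-overlap search described above can be illustrated as follows. This is a sketch only: the `(x, y, w, h)` rectangle layout, the function names, and the `margin` value in pixels are assumptions. The description asks only for a detection rectangle that overlaps the enlarged rectangle; picking the one with the largest overlap is an added tie-breaking choice, not something the source states.

```python
def enlarge(rect, margin):
    """Grow an (x, y, w, h) rectangle by margin pixels on all four sides."""
    x, y, w, h = rect
    return (x - margin, y - margin, w + 2 * margin, h + 2 * margin)


def overlap_area(a, b):
    """Intersection area of two (x, y, w, h) rectangles; 0 if they are disjoint."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = min(ax + aw, bx + bw) - max(ax, bx)
    dy = min(ay + ah, by + bh) - max(ay, by)
    return dx * dy if dx > 0 and dy > 0 else 0


def find_overlapping(existing, detections, margin=8):
    """Return the detection overlapping the enlarged existing rectangle most, or None."""
    search = enlarge(existing, margin)
    best = max(detections, key=lambda d: overlap_area(search, d), default=None)
    if best is None or overlap_area(search, best) == 0:
        return None
    return best
```

With `margin=0` a face that has shifted just outside the previous rectangle is missed, while a small positive margin picks it up, which is the effect the enlargement step is meant to achieve.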

The ID of the object being tracked is stored in the predetermined table (SB10), and tracking is executed. Next, it is determined again whether the tracking state was "no tracking" (SB11; the same determination as in SB1). If it was "no tracking", the current tracking position rectangle information (obtained in SB4) is stored as the tracking position rectangle information, and the tracking state is set to "tracking" (SB12, SB13). After the tracking state is set to "tracking", the tracking release time is reset as shown in FIG. 7 (SB21), and the processing ends.

If it is determined in SB11 that tracking was in progress, the tracking position rectangle information is updated by adding position information obtained by multiplying the current tracking position rectangle information (acquired in SB4) by a weight to the existing tracking position rectangle information (the information held while tracking). This processing is executed, for example, at 10 frames per second or 5 frames per second, with the existing tracking position rectangle information multiplied by a weight of, for example, 0.3 and the current tracking position rectangle information by a weight of, for example, 0.7. This yields a smooth change from the existing tracking position rectangle information to the current tracking position rectangle information.
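The weighted update can be written as a one-line blend. The sketch below assumes rectangles are `(x, y, w, h)` tuples and that every component is blended with the same weights; the description gives 0.3 and 0.7 only as example values, and the function name `blend_rect` is hypothetical.

```python
def blend_rect(existing, current, w_existing=0.3, w_current=0.7):
    """Blend two (x, y, w, h) rectangles component by component.

    The default weights follow the example values in the description.
    """
    return tuple(w_existing * e + w_current * c for e, c in zip(existing, current))
```

Because the weights sum to 1, repeated calls with an unchanged `current` rectangle move the tracked rectangle steadily toward it without overshooting, which is what gives the displayed cutout its smooth motion.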

In SB4, the current tracking position rectangle information may fail to be acquired. In that case, as shown in SB22 of FIG. 7, it is determined whether the tracking state was "tracking". If it was "tracking", the tracking state is set to "lost" (SB23) and the tracking release elapsed time is calculated (SB24). If the determination in SB22 finds that the tracking state was not "tracking", that is, it was "lost", the tracking release elapsed time is calculated (SB24).

If the tracking release elapsed time exceeds the specified time, the tracking state is set to "no tracking" and the processing ends; otherwise, the processing simply ends.
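The three-state management with the tracking release timer can be sketched as below. The class name `TrackStatus`, the `release_after` value, and the use of caller-supplied timestamps in seconds are assumptions. The description does not cover the case where no rectangle is obtained while the state is already "no tracking"; the sketch leaves the state unchanged there.

```python
NO_TRACKING, TRACKING, LOST = "no tracking", "tracking", "lost"


class TrackStatus:
    def __init__(self, release_after=2.0):
        self.state = NO_TRACKING
        self.lost_since = None
        self.release_after = release_after  # hypothetical "specified time" in seconds

    def update(self, got_rect, now):
        """Advance the state for one frame; got_rect is the SB4 result."""
        if got_rect:
            self.state = TRACKING
            self.lost_since = None           # reset the release timer (SB21)
            return self.state
        if self.state == NO_TRACKING:
            return self.state                # assumption: nothing to release
        if self.state == TRACKING:
            self.state = LOST                # SB23
            self.lost_since = now
        elapsed = now - self.lost_since      # SB24
        if elapsed > self.release_after:
            self.state = NO_TRACKING
            self.lost_since = None
        return self.state
```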

FIG. 8 shows in detail the operation flow when a moving object is detected from the lost state (SA4 in FIG. 5).

First, an attempt is made to acquire moving object / stationary object rectangle information with the same object ID as that of the existing tracking position rectangle information (SC1). If the same object exists (Yes in SC2), it is determined whether the current tracking position rectangle information corresponding to that object has been successfully acquired (SC4). If the same object does not exist (No in SC2), the existing tracking position rectangle information is enlarged by a specified size to the left, right, top, and bottom, and moving object / stationary object detection rectangle information that overlaps the enlarged existing tracking position rectangle information is acquired (SC3).

Next, it is determined whether the current tracking position rectangle information has been successfully acquired (or created) (SC4), and if so, the ID of the object being tracked is stored in the predetermined table (SC5). Then new rectangle information is created (updated) by adding the position obtained by multiplying the current tracking position rectangle information by a weight to the existing tracking position rectangle information (SC6). Here too, the existing tracking position rectangle information is multiplied by a weight of, for example, 0.3 and the current tracking position rectangle information by a weight of, for example, 0.7. This yields a smooth change from the existing tracking position rectangle information to the current tracking position rectangle information.

The tracking release time is then reset and the processing ends (SC8). If the current tracking position rectangle information could not be acquired in SC4 (No in SC4), the tracking release elapsed time is calculated (SC11), and it is determined whether it exceeds the specified time (SC12). If it does, the tracking state is set to "no tracking" (SC13) and the processing ends; if not, the processing ends.

FIGS. 9(a) and 9(b) are explanatory diagrams of the omnidirectional high-definition monitoring image and the panoramic images. FIG. 9(a) shows an example of an omnidirectional high-definition monitoring image 250 stored in the image buffer memory 202. The omnidirectional high-definition monitoring image 250 is an image of the space of a room and includes a person 253 to be displayed. For example, the rectangular area enclosed by a frame 254 is the attention area. When the omnidirectional high-definition monitoring image 250 is converted into panoramic images, it is divided into upper and lower parts at, for example, the position indicated by a dividing line 251, and a first panoramic image 261 and a second panoramic image 262 are generated as shown in FIG. 9(b).
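The description does not give the mapping used to convert a rectangle found in a panoramic image back to coordinates in the omnidirectional image. Purely as an illustration, the sketch below assumes a simple polar unwrapping: each of the two panoramas covers 180° of azimuth, the panorama column is proportional to the angle, and the top row of the panorama lies on the outer edge of the image circle. All function and parameter names are hypothetical, and a real lens would need its own calibrated projection model.

```python
import math


def pano_to_fisheye(u, v, half, pano_w, pano_h, cx, cy, radius):
    """Map panorama pixel (u, v) in half 0 or 1 to omnidirectional image coordinates.

    Assumed model: linear angle along u, linear radius along v, v=0 on the rim.
    """
    theta = math.pi * (half + u / pano_w)
    r = radius * (1.0 - v / pano_h)
    return cx + r * math.cos(theta), cy + r * math.sin(theta)


def rect_to_fisheye_bbox(rect, half, pano_w, pano_h, cx, cy, radius, steps=16):
    """Bounding box (x, y, w, h) in the omnidirectional image of a panorama rectangle."""
    u0, v0, w, h = rect
    pts = []
    for i in range(steps + 1):
        t = i / steps
        # Sample all four edges: a straight panorama edge becomes an arc,
        # which can bulge beyond the mapped corner points.
        for u, v in ((u0 + t * w, v0), (u0 + t * w, v0 + h),
                     (u0, v0 + t * h), (u0 + w, v0 + t * h)):
            pts.append(pano_to_fisheye(u, v, half, pano_w, pano_h, cx, cy, radius))
    xs, ys = zip(*pts)
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)
```

Taking the bounding box of sampled edge points, rather than of the four corners alone, keeps the cutout from clipping the curved side of the region.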

FIG. 10 is a diagram showing an example of an image displayed on the display device 321 by the monitoring image processing apparatus of the present embodiment. The whole is an attention area image 303, which includes a face image 301. A resized omnidirectional monitoring image 302 is inserted into the attention area image 303. The omnidirectional monitoring image 302 contains a person image 302a corresponding to the face image 301.

With this display, the position to which the face image 301 in the attention area has moved can be easily confirmed using the omnidirectional monitoring image 302. A frame 302b may also be added to the person image 302a included in the omnidirectional monitoring image 302. The frame 302b is generated, for example, by writing into the image buffer memory 202 the face detection rectangular coordinate information used when cutting out the attention image (face image).

Further, attribute information 312 may be displayed in the attention area image 303. A resized face rectangular image 311 may be displayed together with the attribute information 312. The face rectangular image 311 is an image taken from the object information buffer 205 of FIG. 2.

With such a monitoring image processing apparatus, the attention area image 303 is cut out as a high-definition face rectangular image from the omnidirectional high-definition monitoring image stored in the image buffer memory 202. The image quality is therefore higher than when cutting out from the panorama-converted image. The blocking luminance expansion processing further improves the image quality.

In addition, the resized omnidirectional monitoring image 302 is inserted into the attention area image 303, and the omnidirectional monitoring image 302 contains the person image 302a corresponding to the face image 301. This makes it easy to confirm (monitor) where in the monitored space the person in the face image 301 has moved, which improves the reliability and performance of the monitoring apparatus. The attribute information is also displayed, which further enhances the monitoring function.

In the displayed image, the attribute information 312 and the resized face rectangular image 311 do not necessarily have to be displayed.

FIG. 11 shows an example of the image displayed on the display device 321 when tracking of the attention image in the first image processing unit A1 has entered the lost state. In the attention area image 303, which had been displayed during face image tracking, the face image of the person is, for example, masked. The tracking of the moving object by the second image processing unit A2 is indicated in the omnidirectional monitoring image 302 by a moving object tracking frame 302c (that is, a mark). An arrow indicating the moving direction of the moving object may also be displayed at this time.

In the embodiment described above, when face detection cannot be obtained, the movement of a person who appears to be the target person is tracked. However, the person who appears to be the target person may instead be tracked while checking attribute information (color, shape, accessories, and the like), or by combining this attribute information with motion information. That is, when a moving object is tracked, one or more items of attribute information (clothing color, hat, and the like) may be used together with the motion information. In that case, the attribute information being used for tracking may be displayed blinking or in a changed color.

A plurality of faces may also be detected. When a specific person (one person) is to be tracked in such a case, it is preferable to track by combining attribute information. For example, suppose the attribute information of the specific person indicates long hair and red clothing. If, when face detection enters the lost state, the motion detection side tracks a moving object having the same attribute information, then when a face is detected again there is a high probability that it belongs to the specific person.
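One simple way to combine attribute information with moving object tracking is to score each candidate by how many of the target's attributes it shares. The sketch below is an assumption throughout: the attribute dictionaries, the scoring rule, the `min_score` threshold, and the function names are illustrative and are not taken from the description.

```python
def attribute_score(target, candidate):
    """Fraction of the target's attributes that the candidate shares."""
    if not target:
        return 0.0
    shared = sum(1 for key, value in target.items() if candidate.get(key) == value)
    return shared / len(target)


def pick_candidate(target, moving_objects, min_score=0.5):
    """Return the ID of the best-matching moving object, or None if none qualifies."""
    scored = [(attribute_score(target, attrs), obj_id)
              for obj_id, attrs in moving_objects.items()]
    scored = [pair for pair in scored if pair[0] >= min_score]
    return max(scored, key=lambda pair: pair[0])[1] if scored else None
```

For the long hair and red clothing example, a moving object matching both attributes scores 1.0 and is preferred over one matching only the clothing color.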

In the example, the individual feature is a face, and the face image is tracked. However, the tracking target is not limited to a face; the technique is also applicable to a specific mark, a national flag, a specific design, or an object that emits light in a specific manner.

The embodiment has been described as an apparatus that acquires an image captured over all 360° directions, but any apparatus that acquires an image captured over 180° or more will do. Also, the image cutout and coordinate information have been described as rectangular, but they are not limited to this and may have various shapes. For example, when there are a plurality of persons to be detected, the cutout frames of the respective face images may have different shapes.

Although several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and in the invention described in the claims and its equivalents. Furthermore, each constituent element of the claims remains within the scope of the present invention even when it is expressed in divided form, when several are expressed together, or when these are combined. Even when the claims are expressed as control logic, as a program including instructions to be executed by a computer, or as a computer-readable recording medium on which the instructions are recorded, the apparatus of the present invention is applied.

A1 ... first image processing unit, A2 ... second image processing unit, A3 ... attention image output unit, 11, 12, 13, 14 ... resizing units, 101 ... monocular camera (imaging unit), 200A ... data processing device, 201 ... capture, 202 ... image buffer memory, 203 ... panoramic image generation unit, 204 ... image tracking processing unit, 205 ... object information buffer, 206 ... coordinate conversion processing unit, 207 ... face detection rectangular coordinate information unit, 311 ... image tracking processing unit, 312 ... moving object detection rectangular coordinate information unit, 314 ... object tracking processing unit, 315 ... attention area image creation unit, 316 ... object image buffer memory, 320 ... blocking luminance expansion unit, 321 ... display device.

Claims

A monitoring image processing apparatus comprising an imaging unit, a data processing device, and a display device, wherein the imaging unit inputs, to the data processing device, wide-area image data obtained by imaging a space including an object,
wherein the data processing device includes:
a first image processing unit that tracks an individual feature (face) of the object, a second image processing unit that tracks a motion feature of the object, and an attention image output unit that cuts out a first attention image (face image) including an image of the individual feature (face) and supplies it to the display device,
when the first image processing unit detects the image of the individual feature (face), it generates, based on rectangle information surrounding that image, first image area information for cutting out the first attention image (face image) from the wide-area image data,
when the first image processing unit loses track of the image of the individual feature (face), the second image processing unit tracks a second attention image (moving body image) based on motion detection information of the image and generates second image area information with which the second attention image (moving body image) can be cut out from the wide-area image data,
while the first image processing unit is tracking the individual feature (face), the second image processing unit sets the second image area information for cutting out the second attention image (moving body image) on the basis of the first image area information generated by the first image processing unit, and thereby virtually tracks the second attention image (moving body image), and
the attention image output unit cuts out the image of the individual feature (face) from the wide-area image data based on at least the first image area information and supplies it to the display device.
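
As an illustration only, the handover between the two image processing units in the claim above might be sketched as follows. The class `DualTracker`, its `update` method, the `mode` strings, and the helper `crop` are hypothetical names that do not appear in the patent; rectangles are assumed to be `(x, y, width, height)` tuples in wide-area image pixels, and "virtual tracking" is assumed to mean that the motion tracker simply adopts the face rectangle while the face is visible.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), an assumed layout


@dataclass
class DualTracker:
    """Hypothetical coordinator for the face tracker and the motion tracker."""
    face_rect: Optional[Rect] = None    # first image area information
    motion_rect: Optional[Rect] = None  # second image area information
    mode: str = "idle"                  # "face", "motion", or "idle"

    def update(self, face_det: Optional[Rect],
               motion_det: Optional[Rect]) -> Optional[Rect]:
        """Return the rectangle to cut out for this frame, or None."""
        if face_det is not None:
            # Face visible: the face rectangle has priority, and the motion
            # tracker "virtually" follows by adopting the same rectangle.
            self.face_rect = face_det
            self.motion_rect = face_det
            self.mode = "face"
            return self.face_rect
        if self.motion_rect is not None and motion_det is not None:
            # Face lost: motion tracking takes over from the position it
            # already held thanks to the virtual tracking above.
            self.face_rect = None
            self.motion_rect = motion_det
            self.mode = "motion"
            return self.motion_rect
        # Nothing to follow: forget stale rectangles.
        self.face_rect = None
        self.motion_rect = None
        self.mode = "idle"
        return None


def crop(image, rect: Rect):
    """Cut a rectangle out of an image stored as a list of pixel rows."""
    x, y, w, h = rect
    return [row[x:x + w] for row in image[y:y + h]]
```

In this sketch a motion detection that arrives before any face has been seen is ignored, which is one possible reading of the claim and not something the text states.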

A monitoring image processing apparatus comprising an imaging unit, a data processing device, and a display device, wherein the imaging unit inputs, to the data processing device, wide-area image data obtained by imaging a space including an object,
wherein the data processing device includes:
a first image processing unit that tracks an individual feature (face) of the object, a second image processing unit that tracks a motion feature of the object, and an attention image output unit that cuts out a first attention image (face image) including an image of the individual feature (face) and supplies it to the display device,
when the first image processing unit detects the image of the individual feature (face), it generates, based on rectangle information surrounding that image, first image area information for cutting out the first attention image (face image) from the wide-area image data,
when the first image processing unit loses track of the image of the individual feature (face), the second image processing unit tracks a second attention image (moving body image) based on motion detection information of the image and generates second image area information with which the second attention image (moving body image) can be cut out from the wide-area image data,
while the second image processing unit is tracking the motion feature, the first image processing unit sets the first image area information for cutting out the first attention image (face image) on the basis of the second image area information generated by the second image processing unit, and thereby virtually tracks the individual feature (face), and
the attention image output unit cuts out the image of the individual feature (face) from the wide-area image data based on at least the first image area information and supplies it to the display device.

The monitoring image processing apparatus according to claim 1, wherein, when the first image processing unit detects an image of the individual feature (face), the attention image output unit preferentially uses the first image area information.

The monitoring image processing apparatus according to claim 1, wherein, when the first image processing unit loses track of the individual feature (face), the second image processing unit generates, based on the first image area information, second image area information covering a larger area than the first image area information, and starts virtual tracking.
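
One way to picture the enlarged second image area is the sketch below. The function name `enlarge_rect`, the `scale` factor, and the choice to grow the rectangle around its center and clamp it to the image are all assumptions made for illustration; the claim only states that the second area is larger than the first.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), an assumed layout


def enlarge_rect(rect: Rect, scale: float, image_size: Tuple[int, int]) -> Rect:
    """Grow a rectangle around its center and keep it inside the image.

    Hypothetical helper: the growth factor is not given in the source.
    """
    x, y, w, h = rect
    img_w, img_h = image_size
    # Enlarged size, never bigger than the image itself.
    new_w = min(img_w, round(w * scale))
    new_h = min(img_h, round(h * scale))
    # Keep the original center, then shift back inside the image if needed.
    new_x = round(x + w / 2 - new_w / 2)
    new_y = round(y + h / 2 - new_h / 2)
    new_x = max(0, min(new_x, img_w - new_w))
    new_y = max(0, min(new_y, img_h - new_h))
    return (new_x, new_y, new_w, new_h)
```

A wider starting rectangle gives the motion tracker a better chance of covering the whole moving body once the face itself is no longer detectable.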

The monitoring image processing apparatus according to claim 1, wherein the data processing device causes the display device to receive and display the first attention image (face image), and to receive a resized image of the wide-area image data and display it on the first attention image (face image).

The monitoring image processing apparatus according to claim 1, wherein the data processing device further causes the display device to receive and display attribute information related to the first attention image (face image).

The monitoring image processing apparatus according to claim 1, wherein, when the first image processing unit has lost track of the individual feature (face), the data processing device causes the display device to receive and display the image of the wide-area image data, and to display over that image a mark indicating the second attention image (moving body image) tracked by the second image processing unit.

The monitoring image processing apparatus according to claim 1, wherein the imaging unit acquires a high-resolution wide-angle visual field monitoring image by imaging a space at a viewing angle of 180° or more, and
the first image processing unit includes:
first converting means for reducing the number of bits, resizing to a resolution lower than that of the wide-angle visual field monitoring image, dividing the result into at least two parts, and converting them into a first panoramic image and a second panoramic image; detection and tracking means for tracking at least the individual feature from the first panoramic image or the second panoramic image; and coordinate conversion means for converting coordinate data of the individual feature on the first panoramic image or the second panoramic image into the first image area information on the wide-angle visual field monitoring image.
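
The coordinate conversion in the claim above can be pictured with the following sketch. It is not the patent's own mapping: it assumes an ideal circular fisheye image with a known `center` and `radius` in full-resolution pixels, two panoramas that each unroll half of the circle (index 0 for the first half, 1 for the second), and a linear relation between panorama row and radial distance. The function names are hypothetical.

```python
import math
from typing import Tuple


def pano_to_fisheye(u: float, v: float, pano_index: int,
                    pano_size: Tuple[int, int],
                    center: Tuple[float, float],
                    radius: float) -> Tuple[float, float]:
    """Map a panorama pixel (u, v) to wide-angle image coordinates.

    Assumed model: column u sweeps half a turn of angle, row v runs from
    the outer rim (v = 0) to the image center (v = panorama height).
    """
    pano_w, pano_h = pano_size
    theta = math.pi * pano_index + math.pi * (u / pano_w)
    r = radius * (1.0 - v / pano_h)
    cx, cy = center
    return (cx + r * math.cos(theta), cy + r * math.sin(theta))


def pano_rect_to_fisheye_rect(rect, pano_index, pano_size, center, radius):
    """Approximate a panorama rectangle by a box on the wide-angle image.

    Uses the bounding box of the four mapped corners. The true region is
    arc-shaped and can bulge slightly past this box, so a real system
    would probably add a margin.
    """
    x, y, w, h = rect
    corners = [(x, y), (x + w, y), (x, y + h), (x + w, y + h)]
    pts = [pano_to_fisheye(u, v, pano_index, pano_size, center, radius)
           for u, v in corners]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    left, top = math.floor(min(xs)), math.floor(min(ys))
    right, bottom = math.ceil(max(xs)), math.ceil(max(ys))
    return (left, top, right - left, bottom - top)
```

Because the mapping uses normalized panorama positions (`u / pano_w`, `v / pano_h`) together with the full-resolution `center` and `radius`, a rectangle found on the reduced-resolution panorama lands directly in full-resolution coordinates under these assumptions.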

A monitoring image processing method using an imaging unit, a data processing device, and a display device, in which the imaging unit inputs, to the data processing device, wide-area image data obtained by imaging a space including an object, the method comprising:
tracking, by a first image processing unit, an individual feature (face) of the object, tracking, by a second image processing unit, a motion feature of the object, and cutting out, by an attention image output unit, a first attention image (face image) and supplying it to the display device;
when the image of the individual feature (face) is detected, generating, based on rectangle information surrounding that image, first image area information for cutting out the first attention image (face image) from the wide-area image data;
when the first image processing unit loses track of the individual feature (face), tracking the motion feature based on motion detection information of the image and generating second image area information with which a second attention image (moving body image) can be cut out from the wide-area image data;
while the first image processing unit is tracking the individual feature (face), setting, by the second image processing unit, the second image area information for cutting out the second attention image (moving body image) on the basis of the first image area information generated by the first image processing unit, thereby virtually tracking the second attention image (moving body image); and
cutting out the first attention image (face image) from the wide-area image data based on at least the first image area information and supplying it to the display device.

A monitoring image processing method using an imaging unit, a data processing device, and a display device, in which the imaging unit inputs, to the data processing device, wide-area image data obtained by imaging a space including an object, the method comprising:
tracking, by a first image processing unit, an individual feature (face) of the object, tracking, by a second image processing unit, a motion feature of the object, and cutting out, by an attention image output unit, a first attention image (face image) and supplying it to the display device;
when the image of the individual feature (face) is detected, generating, based on rectangle information surrounding that image, first image area information for cutting out the first attention image (face image) from the wide-area image data;
when the first image processing unit loses track of the individual feature (face), tracking the motion feature based on motion detection information of the image and generating second image area information with which a second attention image (moving body image) can be cut out from the wide-area image data;
while the second image processing unit is tracking the motion feature, setting, by the first image processing unit, the first image area information for cutting out the first attention image (face image) on the basis of the second image area information generated by the second image processing unit, thereby virtually tracking the individual feature (face); and
cutting out the first attention image (face image) from the wide-area image data based on at least the first image area information and supplying it to the display device.

The monitoring image processing method according to claim 9 or 10, wherein, when the image of the individual feature (face) is detected, the first image area information is used preferentially to cut out the first attention image (face image) from the wide-area image data and supply it to the display device.

The monitoring image processing method according to claim 9 or 10, wherein the imaging unit acquires a high-resolution wide-angle visual field monitoring image by imaging a space at a viewing angle of 180° or more, and
the first image processing unit:
reduces the number of bits, resizes to a resolution lower than that of the wide-angle visual field monitoring image, divides the result into at least two parts, and converts them into a first panoramic image and a second panoramic image;
tracks at least the individual feature from the first panoramic image or the second panoramic image; and
converts coordinate data of the individual feature on the first panoramic image or the second panoramic image to generate the first image area information on the wide-angle visual field monitoring image.

A monitoring image processing apparatus comprising an imaging unit, a data processing device, and a display device, wherein the imaging unit inputs, to the data processing device, wide-area image data obtained by imaging a space including an object,
wherein the data processing device includes:
a first image processing unit that tracks an individual feature of the object, a second image processing unit that tracks a motion feature of the object, and an attention image output unit that cuts out an image of the individual feature and supplies it to the display device,
when the first image processing unit detects the image of the individual feature, it generates, based on rectangle information surrounding that image, first image area information for cutting out the image of the individual feature from the wide-area image data,
when the first image processing unit loses track of the image of the individual feature, the second image processing unit tracks a moving body image based on motion detection information of the image and generates second image area information with which the moving body image can be cut out from the wide-area image data,
while the first image processing unit is tracking the individual feature, the second image processing unit sets the second image area information for cutting out the moving body image on the basis of the first image area information generated by the first image processing unit, and thereby virtually tracks the moving body image, and
the attention image output unit cuts out the image of the individual feature from the wide-area image data based on at least the first image area information and supplies it to the display device.

A monitoring image processing apparatus comprising an imaging unit, a data processing device, and a display device, wherein the imaging unit inputs, to the data processing device, wide-area image data obtained by imaging a space including an object,
wherein the data processing device includes:
a first image processing unit that tracks an individual feature of the object, a second image processing unit that tracks a motion feature of the object, and an attention image output unit that cuts out an image of the individual feature and supplies it to the display device,
when the first image processing unit detects the image of the individual feature, it generates, based on rectangle information surrounding that image, first image area information for cutting out the image of the individual feature from the wide-area image data,
when the first image processing unit loses track of the image of the individual feature, the second image processing unit tracks a moving body image based on motion detection information of the image and generates second image area information with which the moving body image can be cut out from the wide-area image data,
while the second image processing unit is tracking the motion feature, the first image processing unit sets the first image area information for cutting out the image of the individual feature on the basis of the second image area information generated by the second image processing unit, and thereby virtually tracks the individual feature, and
the attention image output unit cuts out the image of the individual feature from the wide-area image data based on at least the first image area information and supplies it to the display device.

The monitoring image processing apparatus according to claim 13 or 14, wherein the imaging unit acquires a high-resolution wide-angle visual field monitoring image by imaging a space at a viewing angle of 180° or more, and
the first image processing unit includes:
first converting means for reducing the number of bits, resizing to a resolution lower than that of the wide-angle visual field monitoring image, dividing the result into at least two parts, and converting them into a first panoramic image and a second panoramic image; detection and tracking means for tracking at least the individual feature from the first panoramic image or the second panoramic image; and coordinate conversion means for converting coordinate data of the individual feature on the first panoramic image or the second panoramic image into the first image area information on the wide-angle visual field monitoring image.