JP2012129709A

JP2012129709A - Information processor, information processing method, and program

Info

Publication number: JP2012129709A
Application number: JP2010278089A
Authority: JP
Inventors: Hideo Noro; 英生野呂
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2010-12-14
Filing date: 2010-12-14
Publication date: 2012-07-05
Anticipated expiration: 2030-12-14
Also published as: JP5839796B2

Abstract

PROBLEM TO BE SOLVED: To capture a target object accurately and stably, regardless of its motion, without adding a special component such as a motion prediction unit. SOLUTION: A low-resolution ROI (region of interest) determination unit 1222 determines, based on image data captured at a preceding time, a first region of interest, which is the region of interest of low-resolution image data to be captured at the succeeding time. An image input unit 121 acquires, from a camera 11, the portion of the low-resolution image data that corresponds to the first region of interest. A high-resolution ROI determination unit 1223 determines, based on the low-resolution image data corresponding to the first region of interest, a second region of interest, which is the region of interest of high-resolution image data to be captured at the succeeding time.

Description

The present invention relates to a technique for determining a region of interest in image data so as to capture a target object to be tracked.

In recent years, robot technology has advanced rapidly, and pet robots and similar products are being actively developed as consumer toys alongside industrial robots. Industrial assembly robots are increasingly required to support high-mix, low-volume production, and the products they assemble are themselves becoming more complex. To cope with this situation, special-purpose tools and dedicated sensors tailored to individual products must be kept to a minimum. Attention has therefore focused on combining a camera with an articulated robot, recognizing the target object by image processing, and performing assembly work on that basis. In consumer robots such as pet robots, the available sensors are limited, and some robots use a camera as a sensor that can acquire a large amount of information and extract various kinds of information from it by image processing.

One of the most basic functions of a visual sensor that combines a camera with image processing is tracking a target object. There are several approaches to camera-based tracking. In one approach, the camera is fixed and continuously images a certain range, and a target object within that imaging range is tracked. This approach is widely used in surveillance, for example to monitor suspicious persons or abnormal objects, and is also used to measure pedestrian and vehicle traffic. Some recent digital cameras combine face recognition with target tracking so that a person's face is always kept in focus. In another approach, the camera is moved or reoriented, in response to the movement of the target object or of the robot, so that it keeps capturing the target object. This approach is used for three-dimensional shape measurement of the target object, position and orientation recognition, and the like. Approaches that combine the two are also conceivable. In every case, the extent of the target object is identified in the captured image, and a downstream image processing device performs various kinds of processing. Tracking the target object therefore amounts to determining the imaging range for the next and subsequent frames, determining the image range to send to the image processing device, or both.

A conventional object tracking imaging system consists of a camera and an object tracking processing device. The object tracking processing device extracts the object region to be tracked from the image data captured by the camera and determines a region of interest (ROI) from the image data of that object region. It then generates image data corresponding to the determined region of interest and passes that image data to an image processing device. From the image data corresponding to the region of interest, the image processing device performs processing such as reading a part number stamped on the target object or determining where to insert the manipulator when a robot grips the target object.

In this way, the image data corresponding to the region of interest to be processed by the image processing device is obtained. However, the communication path connecting the camera and the object tracking processing device generally has an upper limit on the amount of data it can carry (its bandwidth). For example, transferring VGA-size (640 × 480 pixels) RGB color images (8 bits per color) at 60 frames per second requires 640 × 480 × 8 × 3 × 60 = 442,368,000 bit/sec, or roughly 0.4 Gbit/sec. If the ROI is assumed to be 1/4 of the captured image in both height and width, it measures 160 × 120 pixels. Depending on the application, however, image processing may call for a higher-definition image. If, for example, a VGA-size ROI is required, the captured image must be 2560 × 1920 pixels, which generates more than 7 Gbit/sec of traffic on the communication path. Raising the frame rate (the number of frames transferred per second) requires still more bandwidth. Put another way, the bandwidth of the communication path constrains the image size (ROI size) and frame rate of the image data that the image processing device can process. If, however, only the image data that the image processing device actually needs were sent from the camera onto the communication path, the bandwidth of the communication path could be used effectively.
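The bandwidth figures above follow from simple arithmetic. The sketch below reproduces them in Python; the function name `required_bandwidth_bps` and its parameter names are illustrative assumptions and do not appear in the publication.

```python
def required_bandwidth_bps(width, height, fps=60, bits_per_channel=8, channels=3):
    """Uncompressed bit rate needed to stream frames of the given size."""
    return width * height * bits_per_channel * channels * fps


vga_bps = required_bandwidth_bps(640, 480)      # full VGA frame
full_bps = required_bandwidth_bps(2560, 1920)   # frame whose 1/4-by-1/4 ROI is VGA-sized

print(vga_bps)              # 442368000  (about 0.4 Gbit/sec)
print(full_bps)             # 7077888000 (just over 7 Gbit/sec)
print(full_bps // vga_bps)  # 16: sending the whole frame costs 16x the ROI alone
```

The factor of 16 is the gap the text points to: only the VGA-sized ROI is needed downstream, yet sending the whole frame occupies sixteen times the bandwidth.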

Patent Document 1 discloses a technique for acquiring ROI image data by controlling the position and orientation of the camera itself. In this case, the imaging range of the camera matches the ROI size.

Turning to how the ROI is determined, it is determined on the basis of the image data of the preceding frame. It is therefore difficult to capture the target object accurately when, for example, the object moves fast relative to the frame size and frame rate. In that case, the position of the target object in the next frame has to be predicted from its past motion.

Japanese Unexamined Patent Application Publication No. 2005-57743 (JP-A-2005-57743)

However, the approach described above, which predicts the position of the target object in the next frame from its past motion, requires a motion prediction unit even if the frame rate can be raised. A further problem is that when the target object moves differently from what the motion prediction unit assumes, it is difficult to capture the object accurately and stably.

An object of the present invention is therefore to capture a target object accurately and stably, regardless of its motion, without adding a special component such as a motion prediction unit.

The information processing apparatus of the present invention comprises: first determination means for determining, based on image data captured at a preceding time point, a first region of interest, which is the region of interest of low-resolution image data captured at the next time point; acquisition means for acquiring, out of the low-resolution image data, the low-resolution image data corresponding to the first region of interest; and second determination means for determining, based on the low-resolution image data corresponding to the first region of interest, a second region of interest, which is the region of interest of high-resolution image data captured at the next time point.

According to the present invention, a target object can be captured accurately and stably, regardless of its motion, without adding a special component such as a motion prediction unit.

FIG. 1 is a diagram showing the configuration of an object tracking imaging system according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the pixels constituting low-resolution ROI image data and the pixels constituting high-resolution ROI image data on a CMOS image sensor.
FIG. 3 is a diagram showing the relationship between the low-resolution ROI at time t and the high-resolution ROI at time t when, for example, the target object is moving.
FIG. 4 is a diagram for explaining the method of determining the low-resolution ROI at time t in an embodiment of the present invention.
FIG. 5 is a flowchart showing the procedure for obtaining the image center.
FIG. 6 is a diagram showing image data obtained by imaging the target object at time t-1 and at time t.
FIG. 7 is a diagram showing an example of an object tracking imaging system for comparison with the object tracking imaging system according to the embodiment of the present invention.
FIG. 8 is a diagram for explaining the ROI determination method in the comparative example of FIG. 7.

Preferred embodiments to which the present invention is applied are described in detail below with reference to the accompanying drawings.

FIG. 1 is a diagram showing the configuration of an object tracking imaging system according to an embodiment of the present invention. As shown in FIG. 1, in the object tracking imaging system according to this embodiment, a camera 11 and an object tracking processing device 12 are connected via a communication path 13. The camera 11 comprises an imaging unit 111, an image output unit 112, an imaging control unit 113, and a resolution/ROI control unit 114. The object tracking processing device 12 is an example of the information processing apparatus of the present invention.

The imaging unit 111 comprises a lens, a CMOS image sensor, and the like, and generates image data by imaging. The image output unit 112 transmits the image data generated by the imaging unit 111 to the object tracking processing device 12 via the communication path 13. The imaging control unit 113 controls, for example, the exposure timing and the timing at which the image data held in the CMOS image sensor is erased. For example, the imaging control unit 113 performs exposure at fixed intervals, or erases the image data held in the CMOS image sensor after the image output unit 112 has output the image data to the communication path 13. The resolution/ROI control unit 114 obtains two pieces of information, resolution information and ROI information, from the object tracking processing device 12 and outputs them to the image output unit 112. The resolution information is information that specifies either low-resolution image data or high-resolution image data. This embodiment assumes that, for image data of the same imaging range, high-resolution image data has 1280 × 960 pixels and low-resolution image data has 640 × 480 pixels. The ROI information is information that specifies a region of interest (ROI) within the captured image data. The ROI is, for example, the smallest rectangular region containing the object region, or a rectangular region of a predetermined size whose image center is the center of the object region.

The object tracking processing device 12 comprises an image input unit 121, an ROI image generation unit 123, and an ROI calculation unit 122. The ROI calculation unit 122 comprises a target extraction unit 1221, a low-resolution ROI determination unit 1222, and a high-resolution ROI determination unit 1223. The object tracking processing device 12 obtains image data from the communication path 13 through the image input unit 121. The image input unit 121 holds the image data obtained from the communication path 13 in a state in which the other components of the object tracking processing device 12 can access it. The low-resolution ROI determination unit 1222 is an example of the first determination means, the image input unit 121 is an example of the acquisition means, and the high-resolution ROI determination unit 1223 is an example of the second determination means.

In the following, for convenience of explanation, the imaging times are denoted ..., t-2, t-1, t, t+1, ..., and imaging at time t is assumed to have just finished. The target extraction unit 1221 obtains low-resolution ROI image data or high-resolution ROI image data from the image input unit 121 and extracts the object region to be tracked. The low-resolution ROI determination unit 1222 determines an ROI (hereinafter, the low-resolution ROI) from the object region extracted from the low-resolution ROI image data. The high-resolution ROI determination unit 1223 determines an ROI (hereinafter, the high-resolution ROI) from the object region extracted from the high-resolution ROI image data. The image input unit 121 has already acquired the low-resolution ROI image data and high-resolution ROI image data up to time t-1 from the camera 11 via the communication path 13.

The method of obtaining the ROI image data at time t (low-resolution ROI image data and high-resolution ROI image data) is described next. The target extraction unit 1221 extracts the object region from the low-resolution ROI image data at time t-1. As an example of an extraction method, consider extracting the region of a white, ping-pong-ball-like object moving around against a black background. First, the target extraction unit 1221 obtains luminance image data from the low-resolution ROI image data. Methods of calculating a luminance value Y from RGB color image data are well known; a commonly used one generates luminance image data from the value Y = 0.299R + 0.587G + 0.114B calculated for each pixel. Assuming that the pixel values of the luminance image data are normalized to [0, 1], the image is binarized into pixels whose luminance is at or below a threshold (for example, 0.5) and all other pixels. If binary image data is generated with the former as black pixels and the latter as white pixels, the ping-pong-ball-like object region is represented by white pixels. The object region is then extracted as the smallest circular region that contains all the white pixels.
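The extraction steps just described (luminance conversion, thresholding, and enclosing the white pixels) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the method of the publication: images are nested lists of (R, G, B) tuples normalized to [0, 1], the function names are hypothetical, and the circle is approximated from the bounding box of the white pixels instead of being computed as an exact minimum enclosing circle.

```python
def luminance(r, g, b):
    """Luminance Y from RGB components normalized to [0, 1]."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(rgb_image, threshold=0.5):
    """True marks a white pixel (Y above threshold), False a black pixel."""
    return [[luminance(*px) > threshold for px in row] for row in rgb_image]


def extract_object_region(mask):
    """Return (cx, cy, diameter) of a circle around the white pixels.

    Simplification (assumption): the circle is derived from the bounding box
    of the white pixels, not from an exact minimum-enclosing-circle search.
    """
    xs = [x for row in mask for x, white in enumerate(row) if white]
    ys = [y for y, row in enumerate(mask) for white in row if white]
    if not xs:
        return None  # no object visible
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    diameter = max(max(xs) - min(xs), max(ys) - min(ys)) + 1
    return cx, cy, diameter


# Hypothetical 5x5 frame: black background with a 3x3 white blob in the middle.
BLACK, WHITE = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
frame = [[WHITE if 1 <= x <= 3 and 1 <= y <= 3 else BLACK for x in range(5)]
         for y in range(5)]
region = extract_object_region(binarize(frame))
print(region)  # (2.0, 2.0, 3)
```

The returned center and diameter correspond to the "position" and "size" that the target extraction unit passes on to the ROI determination units.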

The target extraction unit 1221 sends information such as the position and size of the extracted object region to the low-resolution ROI determination unit 1222. When the object region has been extracted by the method described above, the target extraction unit 1221 takes, for example, the center of the object region as the "position" and the diameter of the circular region as the "size".

The low-resolution ROI determination unit 1222 determines the low-resolution ROI at time t (the first region of interest) based on the "position" and "size" of the object region in the low-resolution ROI image data at time t-1. When the frame rate is sufficiently high relative to the motion of the target object (that is, the imaging interval is sufficiently short), the "position" and "size" of the object region can be regarded as almost the same in the ROI image data at time t-1 and at time t. The low-resolution ROI determination unit 1222 therefore determines, for example, the smallest rectangular region containing the object region as the low-resolution ROI at time t. When "position" and "size" are used, the low-resolution ROI may be, for example, the square region centered on the "position" whose side length equals the "size". The low-resolution ROI determination unit 1222 then notifies the resolution/ROI control unit 114 of the camera 11 of two pieces of information: the ROI information indicating the low-resolution ROI at time t, and the resolution information (here, information specifying low-resolution image data).
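A minimal sketch of the square-ROI rule follows. The function name `square_roi`, the (left, top, width, height) return format, and the clamping of the ROI to a 640 × 480 frame are assumptions added for illustration; the text itself specifies only a square centered on the "position" with side length equal to the "size".

```python
def square_roi(position, size, frame_w=640, frame_h=480):
    """Square ROI (left, top, width, height) centred on `position`.

    `size` is the side length in pixels. Clamping to the frame is an
    assumption added here so the ROI never leaves the low-resolution image.
    """
    cx, cy = position
    left = int(round(cx - size / 2))
    top = int(round(cy - size / 2))
    left = max(0, min(left, frame_w - size))
    top = max(0, min(top, frame_h - size))
    return left, top, size, size


print(square_roi((100, 80), 40))  # (80, 60, 40, 40)
print(square_roi((10, 10), 40))   # (0, 0, 40, 40): clamped at the frame corner
```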

The resolution/ROI control unit 114 passes the ROI information and resolution information obtained from the object tracking processing device 12 to the image output unit 112. In accordance with the resolution information, the image output unit 112 acquires the low-resolution image data at time t captured by the imaging unit 111. The image output unit 112 then extracts the range of the low-resolution ROI specified by the ROI information from the low-resolution image data at time t and sends it, as the low-resolution ROI image data at time t, to the object tracking processing device 12 via the communication path 13.

The image input unit 121 obtains the low-resolution ROI image data at time t and sends it to the target extraction unit 1221. The target extraction unit 1221 extracts the object region from the low-resolution ROI image data at time t and sends information such as the "position" and "size" of the extracted object region to the high-resolution ROI determination unit 1223. In general, the high-resolution ROI lies within the region of the high-resolution image data that corresponds to the low-resolution ROI. The high-resolution ROI determination unit 1223 therefore determines the high-resolution ROI at time t (the second region of interest) based on information such as the "position" and "size" of the object region in the low-resolution ROI image data at time t. Here, for example, the smallest rectangular region containing the object region is determined as the high-resolution ROI.

As described above, the low-resolution ROI at time t is determined from the low-resolution ROI image data at time t-1, so when the target object is moving, the center of the object region is not necessarily at the center of the low-resolution ROI image data at time t-1. The high-resolution ROI at time t, however, is determined from the low-resolution ROI image data of the same time t, so the center of the target object can be made to coincide with the center of the high-resolution ROI image data. Determining the high-resolution ROI in two stages in this way allows it to be determined stably regardless of the motion of the target object.
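The effect of the two-stage determination can be illustrated with a small numeric sketch. All values and helper names below are hypothetical; the factor of 2 between low-resolution and high-resolution coordinates follows from the 640 × 480 and 1280 × 960 sizes assumed in this embodiment, and the object is assumed to move little enough between frames to remain inside the low-resolution ROI.

```python
SCALE = 2  # 1280x960 high-res vs 640x480 low-res over the same imaging range


def centred_roi(centre, size):
    """ROI (left, top, width, height) of side `size` centred on `centre`."""
    cx, cy = centre
    return (cx - size // 2, cy - size // 2, size, size)


def roi_centre(roi):
    left, top, w, h = roi
    return (left + w // 2, top + h // 2)


def to_high_res(point, scale=SCALE):
    """Map a low-resolution pixel coordinate to high-resolution coordinates."""
    return (point[0] * scale, point[1] * scale)


object_size = 40            # object diameter in low-res pixels (made up)
centre_prev = (100, 100)    # object centre found in the low-res data at t-1
centre_now = (104, 102)     # object centre found in the low-res ROI data at t

# Stage 1: the low-res ROI for time t is placed using time t-1 data,
# so it lags behind the object by however far the object has moved.
low_roi_t = centred_roi(centre_prev, object_size)

# Stage 2: the high-res ROI for time t is placed using time t data,
# so it is centred on where the object actually is at time t.
high_roi_t = centred_roi(to_high_res(centre_now), object_size * SCALE)

print(roi_centre(low_roi_t))    # (100, 100): off by (4, 2) from the object
print(roi_centre(high_roi_t))   # (208, 204): equals to_high_res(centre_now)
```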

The high-resolution ROI determination unit 1223 then notifies the resolution/ROI control unit 114 of the camera 11 of two pieces of information: the ROI information indicating the determined high-resolution ROI, and the resolution information (here, information specifying high-resolution image data). The resolution/ROI control unit 114 passes the ROI information and resolution information obtained from the object tracking processing device 12 to the image output unit 112. In accordance with the resolution information, the image output unit 112 acquires the high-resolution image data at time t captured by the imaging unit 111. In the following description, low-resolution image data and high-resolution image data both refer to image data of the same imaging range that differ only in resolution. The image output unit 112 extracts the range of the high-resolution ROI specified by the ROI information from the high-resolution image data at time t and sends it, as the high-resolution ROI image data at time t, to the object tracking processing device 12 via the communication path 13.

The image input unit 121 obtains the high-resolution ROI image data at time t and sends it to the ROI image generation unit 123. The ROI image generation unit 123 sends the high-resolution ROI image data at time t to an image processing device 14 located outside the object tracking processing device 12. The image processing device 14 uses the high-resolution ROI image data at time t to execute processing relating to the target object, for example reading a part number stamped on the target object or determining where to insert the manipulator when a robot grips the target object.

After sending the high-resolution ROI image data at time t, the image output unit 112 instructs the imaging control unit 113 to erase the image data of time t and to perform imaging at time t+1. The imaging control unit 113 controls the imaging unit 111 in accordance with the instructions from the image output unit 112.

By repeating the above operation and obtaining ROI image data in sequence (the low-resolution ROI image data at time t+1, the high-resolution ROI image data at time t+1, and so on), the high-resolution ROI image data of each time is sent continuously to the image processing device 14.

In a typical global-shutter CMOS image sensor (one that exposes all pixels simultaneously), the digitized image data resides on the CMOS image sensor and can be read out non-destructively. High-resolution ROI image data can therefore be obtained by having the imaging unit 111 read out all the pixels in the region of the CMOS image sensor designated as the high-resolution ROI. Low-resolution ROI image data can be obtained by having the imaging unit 111 read out the pixels in the region designated as the low-resolution ROI while thinning them out, for example by skipping every other pixel.
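The two readout modes can be sketched with plain list slicing. The sensor is modeled as a nested list of pixel values and the ROI is given in sensor coordinates as (left, top, width, height); these representations and the function names are assumptions for illustration, not an actual sensor interface.

```python
def read_high_res_roi(sensor, roi):
    """Read every pixel inside the ROI (sensor coordinates)."""
    left, top, w, h = roi
    return [row[left:left + w] for row in sensor[top:top + h]]


def read_low_res_roi(sensor, roi, step=2):
    """Read the ROI while skipping every other pixel in both directions."""
    left, top, w, h = roi
    return [row[left:left + w:step] for row in sensor[top:top + h:step]]


# Hypothetical 8x8 sensor whose pixel value encodes its own position.
sensor = [[y * 8 + x for x in range(8)] for y in range(8)]
roi = (2, 2, 4, 4)

high = read_high_res_roi(sensor, roi)
low = read_low_res_roi(sensor, roi)
print(len(high) * len(high[0]))  # 16 pixels read at full resolution
print(low)                       # [[18, 20], [34, 36]]: 4 pixels after thinning
```

Because the readout is non-destructive, both reads can be served from the same exposure, which is what lets the low-resolution and high-resolution ROI images of time t show the object at the same instant.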

In this embodiment, the low-resolution ROI and the high-resolution ROI cover almost the same region, so most of the pixels in the low-resolution ROI image data are also contained in the high-resolution ROI image data. When transferring the high-resolution ROI image data at time t, the image output unit 112 therefore transfers only the pixel data that is not contained in the low-resolution ROI image data at time t. This keeps the amount of image data flowing through the communication path 13 almost equal to the amount of the high-resolution ROI image data that is needed in the first place. In other words, reading out the low-resolution ROI image data with every other pixel skipped and transferring it to the object tracking processing device 12 places almost no additional load on the communication path 13.

FIG. 2 is a diagram showing an example of the pixels constituting low-resolution ROI image data and the pixels constituting high-resolution ROI image data on the CMOS image sensor. In the example of FIG. 2, the pixels constituting the low-resolution ROI image data are marked ○ and ●, and the pixels constituting the high-resolution ROI image data are marked ■ and ●. First, the image output unit 112 reads out the low-resolution ROI image data from the ○ and ● pixels and transfers it to the object tracking processing device 12. The image output unit 112 then reads out the high-resolution ROI image data from the ■ and ● pixels and transfers it to the object tracking processing device 12. Because the ● pixels have already been transferred, however, the image output unit 112 transfers only the ■ pixels. In the case of FIG. 2, the overlap between the low-resolution ROI image data and the high-resolution ROI image data is about 65%, yet the increase in the amount of data (number of pixels) caused by transferring the low-resolution ROI image data is 9% or less.
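The bookkeeping behind this transfer scheme can be sketched with sets of pixel coordinates. The ROI positions below are made up and do not reproduce the layout or the percentages of FIG. 2; the helper name `roi_pixels` is likewise hypothetical.

```python
def roi_pixels(roi, step=1):
    """Sensor coordinates read for an ROI (left, top, width, height)."""
    left, top, w, h = roi
    return {(x, y)
            for y in range(top, top + h, step)
            for x in range(left, left + w, step)}


# Hypothetical ROIs in sensor coordinates; the object has shifted by (2, 2).
low_pixels = roi_pixels((0, 0, 8, 8), step=2)   # thinned read (the ○ and ● pixels)
high_pixels = roi_pixels((2, 2, 8, 8))          # full read (the ■ and ● pixels)

already_sent = low_pixels & high_pixels         # the ● pixels
still_to_send = high_pixels - low_pixels        # the ■ pixels

total_sent = len(low_pixels) + len(still_to_send)
overhead = (total_sent - len(high_pixels)) / len(high_pixels)

print(len(low_pixels), len(already_sent), len(still_to_send))  # 16 9 55
print(total_sent, len(high_pixels))                            # 71 64
print(f"{overhead:.1%}")                                       # 10.9%
```

The only extra traffic comes from thinned pixels that fall outside the high-resolution ROI (the ○ pixels); the better the two ROIs overlap, the smaller that overhead becomes.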

For ease of explanation, this embodiment has been described as obtaining the low-resolution ROI image data at time t, then the high-resolution ROI image data at time t, and then the low-resolution ROI image data at time t+1, but the processing need not be sequential. Once the low-resolution ROI image data at time t has been obtained and the target extraction unit 1221 has obtained information such as the "position" and "size" of the object region, the low-resolution ROI determination unit 1222 and the high-resolution ROI determination unit 1223 may proceed in parallel. The image output unit 112 must, however, send the low-resolution ROI image data at time t before it sends the high-resolution ROI image data at time t. The correct order must therefore be ensured either when the ROI information and resolution information are sent or after the resolution/ROI control unit 114 has received that information.

In this embodiment, the smallest rectangular region containing the object region is used as the ROI, but other methods may be used. Some examples follow.
In a first method, a rectangular region of a predetermined size whose image center is the center of the object region is determined as the ROI. In a second method, the smallest rectangular region that contains the object region and whose image center is the center of the object region is determined as the ROI. In a third method, the image center is the center of the smallest rectangular region containing the object region, and a rectangular region whose size is that smallest rectangle's size multiplied by a fixed factor is determined as the ROI. In a fourth method, the image center is the center of the rectangular region of the second or third method, and the smallest rectangular region of a predetermined aspect ratio that contains that rectangular region is determined as the ROI.
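A minimal sketch of the four alternative ROI rules, under stated assumptions: the object region is given as a list of (x, y) pixel coordinates, "center of the object region" is read as the centroid of those pixels, and extents are treated as continuous (max minus min). The `Rect` type and all function names are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    cx: float  # center x
    cy: float  # center y
    w: float   # width
    h: float   # height


def centroid(points):
    """Mean position of the object pixels (assumed meaning of 'center')."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def bounding_rect(points):
    """Smallest axis-aligned rectangle containing the object region."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return Rect((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2,
                max(xs) - min(xs), max(ys) - min(ys))


def roi_fixed_size(points, w, h):
    """Method 1: predetermined size, centered on the object center."""
    cx, cy = centroid(points)
    return Rect(cx, cy, w, h)


def roi_centered_min(points):
    """Method 2: smallest rectangle centered on the object center
    that still contains every object pixel."""
    cx, cy = centroid(points)
    half_w = max(abs(x - cx) for x, _ in points)
    half_h = max(abs(y - cy) for _, y in points)
    return Rect(cx, cy, 2 * half_w, 2 * half_h)


def roi_scaled(points, scale):
    """Method 3: bounding rectangle enlarged by a fixed factor."""
    r = bounding_rect(points)
    return Rect(r.cx, r.cy, r.w * scale, r.h * scale)


def roi_with_aspect(rect, aspect):
    """Method 4: smallest rectangle with width/height == aspect that
    contains `rect` and shares its center."""
    w = max(rect.w, rect.h * aspect)
    return Rect(rect.cx, rect.cy, w, w / aspect)


obj = [(0, 0), (4, 0), (0, 2)]
print(bounding_rect(obj))
print(roi_with_aspect(roi_scaled(obj, 1.5), aspect=4 / 3))
```

Method 4 takes a rectangle rather than the object pixels so that it can be chained after method 2 or method 3, as the text describes.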

Next, a second embodiment of the present invention will be described. As shown in the first embodiment, the low-resolution ROI determination unit 1222 determines the low-resolution ROI at time t based on the low-resolution ROI image data at time t−1. Consequently, when the target object is moving, for example, the low-resolution ROI at time t does not necessarily contain the high-resolution ROI at time t. This means that the high-resolution ROI determination unit 1223 may have to include a partly unknown region when it determines the high-resolution ROI at time t.

FIG. 3 shows the relationship between the low-resolution ROI at time t and the high-resolution ROI at time t when, for example, the target object is moving. As FIG. 3 shows, if the high-resolution ROI at time t is determined from the low-resolution ROI image data at time t in such a case, part of the high-resolution ROI at time t lies in a region that does not belong to the low-resolution ROI image data at time t. The object region must therefore be estimated for that part of the high-resolution ROI when the high-resolution ROI at time t is determined.

If, instead, the region of the low-resolution ROI image data at time t is made to contain every region that could be included in the high-resolution ROI image data at time t, determining the high-resolution ROI at time t reduces to a simple cropping problem. Specifically, the low-resolution ROI need only be enlarged by the maximum distance the target object can move on the screen during one imaging interval.

FIG. 4 illustrates how the low-resolution ROI at time t is determined in this embodiment. As shown in FIG. 4, the low-resolution ROI at time t is made larger than the high-resolution ROI at time t by the maximum distance (margin) described above. The high-resolution ROI at time t is then always contained in the low-resolution ROI image data at time t, so the high-resolution ROI can be determined accurately and stably. The maximum distance may be determined dynamically, or it may be fixed in advance as an operating requirement of the system.
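The margin rule can be sketched as a rectangle expansion clamped to the sensor. The tuple layout `(left, top, right, bottom)`, the sensor size, and the function names are assumptions made for illustration; the specification does not prescribe a data representation.

```python
def expand_roi(roi, margin, sensor_w, sensor_h):
    """Grow `roi` by `margin` pixels on every side, clamped to the sensor.

    `roi` is (left, top, right, bottom) with right/bottom exclusive.
    """
    left, top, right, bottom = roi
    return (max(0, left - margin), max(0, top - margin),
            min(sensor_w, right + margin), min(sensor_h, bottom + margin))


def contains(outer, inner):
    """True if rectangle `inner` lies entirely inside rectangle `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


# Object region estimated from the preceding frame (hypothetical values).
estimated = (100, 100, 164, 164)
low_res_roi = expand_roi(estimated, margin=20, sensor_w=1024, sensor_h=1024)

# If the object moves by at most the margin in any direction, the region it
# occupies in the current frame is still inside the low-resolution ROI.
moved = (120, 80, 184, 144)   # shifted by (+20, -20) pixels
print(low_res_roi, contains(low_res_roi, moved))
```

Near the sensor border the clamping shrinks the effective margin, so a real system would need to decide how to handle an object that is about to leave the field of view.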

For example, when the target object moves around on a surface 1 m away from an imaging system with a horizontal angle of view of 90° and 1024 pixels in the horizontal direction, the resolution is about 1.5 mm per pixel. When imaging at 1000 frames per second with a margin (maximum distance) of 20 pixels between consecutive frames, 20 [pixels] × 1.5 [mm/pixel] × 1000 [frames/s] = 30000 [mm/s], so a target object moving at up to 30 m/s can be tracked.
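The relation between margin, pixel resolution, and frame rate is a single product, and it can be inverted to size the margin for a required speed. The sketch below takes the 1.5 mm/pixel figure from the text as given rather than deriving it from the angle of view; the function names are hypothetical.

```python
import math


def max_trackable_speed_mm_s(margin_px, mm_per_px, fps):
    """Fastest object speed for which the per-frame displacement
    stays within the margin."""
    return margin_px * mm_per_px * fps


def required_margin_px(speed_mm_s, mm_per_px, fps):
    """Smallest whole-pixel margin that covers one frame of motion
    at the given speed."""
    return math.ceil(speed_mm_s / (mm_per_px * fps))


speed = max_trackable_speed_mm_s(margin_px=20, mm_per_px=1.5, fps=1000)
print(f"{speed:.0f} mm/s = {speed / 1000:.0f} m/s")
print(required_margin_px(speed_mm_s=3000, mm_per_px=1.5, fps=1000), "px")
```

Because the margin scales inversely with the frame rate, a faster sensor allows a smaller low-resolution ROI for the same maximum object speed.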

Next, a third embodiment of the present invention will be described. In the embodiments above, the high-resolution ROI determination unit 1223 determines the high-resolution ROI at time t from the low-resolution ROI image data at time t. The low-resolution ROI image data reveals the rough "position" and "size" of the object region, but the detailed "position" and "size" cannot easily be calculated because the resolution is insufficient. For example, for an object with thin, pointed protrusions such as a spider's legs, the position and orientation of the protrusions may be hard to determine clearly from the low-resolution ROI image data. In such cases, high-resolution ROI image data should be used as well: the low-resolution ROI image data is used to calculate the rough "position" and "size", and the high-resolution ROI image data is used to calculate the detailed "position" and "size". Specifically, the target extraction unit 1221 calculates the rough "position" and "size" of the object region from the low-resolution ROI image data at time t, and then uses the high-resolution ROI image data at time t−1 to calculate the detailed "position" and "size".

FIG. 5 shows an example of a procedure for obtaining the image center, assuming that the size of the high-resolution ROI is constant. FIG. 6 shows the target object (a spider). The left side of FIG. 6 is the image captured at time t−1, and the right side is the image captured at time t. As shown in FIG. 6, the target object is moving from top to bottom, and it has protrusions that are difficult to detect in low-resolution image data; these protrusions are also moving.

In this embodiment, both the low-resolution ROI image data and the high-resolution ROI image data are passed to the high-resolution ROI determination unit 1223. When the low-resolution ROI image data at time t is input, the high-resolution ROI determination unit 1223 calculates, as in the embodiments above, the center of the target object, shown as a cross inside the circle at 601 in FIG. 6 (step S501). The high-resolution ROI determination unit 1223 then corrects the center of the target object using the center obtained and separately stored offset information for each feature specific to the target object (step S502). Here, a feature specific to the target object is a feature that is difficult to extract from the low-resolution ROI image data but can be extracted from the high-resolution ROI image data; in FIG. 6 these are the tips of the spider's legs (the circles at 602 in FIG. 6). The offset information indicates the relative positional relationship between the center position obtained from the low-resolution ROI image data and each feature specific to the target object, and the information for time t−1 is stored. Because images are captured continuously at a high frame rate, the relative positional relationship at time t−1 and that at time t can be regarded as approximately equal. By estimating the features specific to the target object from this offset information, the size and center of the target object are estimated (corrected).

On the right side of FIG. 6, the region corresponding to the high-resolution ROI estimated by this procedure is indicated by a dash-dot line. After correcting the center position of the target object, the high-resolution ROI determination unit 1223 stores the uncorrected center position obtained from the low-resolution ROI image data as the updated low-resolution center position information (step S503). The high-resolution ROI determination unit 1223 then outputs the corrected center of the target object (the high-resolution ROI) (step S504).

When high-resolution ROI image data is input, the high-resolution ROI determination unit 1223 updates the per-feature offset information, that is, the relative positional relationship. At the point when the high-resolution ROI image data at time t is input, the low-resolution center position information holds the center position of the target obtained from the low-resolution ROI image data at time t. The high-resolution ROI determination unit 1223 therefore estimates the feature positions on the high-resolution ROI image data from this low-resolution center position information and the per-feature offset information (step S505). As noted above, because images are captured continuously at a high frame rate, the actual features lie near the feature positions estimated here. The high-resolution ROI determination unit 1223 therefore detects the feature positions of the target object near the estimated positions, using image pattern matching or the like (step S506). The high-resolution ROI determination unit 1223 then calculates the per-feature offset information from the detected feature positions and the low-resolution center position information, and updates it (step S507).

Specifically, the estimation proceeds as follows. First, the high-resolution ROI determination unit 1223 identifies the detailed positions of the features specific to the target. The center position of the abdomen of the target object (the spider) at time t−1 can be estimated from the low-resolution ROI image data at time t. From there, the high-resolution ROI determination unit 1223 uses the offset information to identify the range in which the features specific to the target are likely to be found. The high-resolution ROI determination unit 1223 then detects the features specific to the target object within this range and identifies their detailed positions. To detect these features, several characteristic image patterns of a spider's leg, for example, are prepared in advance, and detection is performed by pattern matching. After identifying the detailed positions of the features specific to the target, the high-resolution ROI determination unit 1223 updates the offset information so that it corresponds to time t. Next, the high-resolution ROI determination unit 1223 obtains the smallest rectangular region containing the tips of all the spider's legs (the features specific to the target), and calculates its center and size as the center and size of the target object.
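Steps S501 to S507 can be sketched as a small stateful estimator. This is an interpretation, not the claimed implementation: the class name, the dictionary of named features, and the `detect` callback (standing in for the pattern-matching search near each predicted position) are all assumptions, and the corrected center is taken as the center of the smallest rectangle around the predicted feature positions.

```python
class HighResRoiEstimator:
    def __init__(self, offsets):
        # feature name -> (dx, dy) relative to the low-resolution center,
        # as measured at the previous time step
        self.offsets = dict(offsets)
        self.low_res_center = None

    def on_low_res(self, center):
        """Handle a low-resolution frame (S501 to S504).

        `center` is the object center already extracted from the
        low-resolution ROI image data (S501).
        """
        cx, cy = center
        # S502: predict feature positions from the stored offsets and take
        # the bounding rectangle of those predictions.
        predicted = [(cx + dx, cy + dy) for dx, dy in self.offsets.values()]
        xs = [x for x, _ in predicted]
        ys = [y for _, y in predicted]
        corrected = ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)
        size = (max(xs) - min(xs), max(ys) - min(ys))
        # S503: remember the uncorrected low-resolution center.
        self.low_res_center = center
        # S504: output the corrected center (and size) for the ROI.
        return corrected, size

    def on_high_res(self, detect):
        """Handle a high-resolution frame (S505 to S507).

        `detect(name, predicted_xy)` is a hypothetical callback that
        searches near `predicted_xy` and returns the detected position.
        """
        if self.low_res_center is None:
            raise RuntimeError("no low-resolution frame processed yet")
        cx, cy = self.low_res_center
        for name, (dx, dy) in list(self.offsets.items()):
            predicted = (cx + dx, cy + dy)            # S505
            fx, fy = detect(name, predicted)          # S506
            self.offsets[name] = (fx - cx, fy - cy)   # S507


est = HighResRoiEstimator({"leg_a": (-10, -5), "leg_b": (10, 15)})
print(est.on_low_res((100, 100)))
est.on_high_res(lambda name, p: (p[0] + 1, p[1] + 2))  # dummy detector
print(est.offsets)
```

Keeping the low-resolution center and the offsets as separate state mirrors the text: the center is refreshed on every low-resolution frame, while the offsets are refreshed only when a high-resolution frame arrives.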

Although an example of obtaining the center of the high-resolution ROI image data has been shown here, other values may be obtained instead. Any value that defines the high-resolution ROI (image size, orientation, and so on), or any value needed to derive one, may be used.

In addition, although the high-resolution ROI at time t is obtained here from the low-resolution ROI image data at times t−1 and t and the high-resolution ROI image data at time t−1, earlier low-resolution and high-resolution ROI images may also be used.

FIG. 7 shows an example of an object tracking imaging system for comparison with the object tracking imaging system according to the embodiments of the present invention. In FIG. 7, components with the same reference numerals as in FIG. 1 are common to FIG. 1, so their description is omitted. The differences from FIG. 1 are that the camera 711 is provided with an ROI control unit 115, and that the object tracking processing device 712 is provided with an ROI calculation unit 722 that includes, in addition to the target extraction unit 1221 common to FIG. 1, a motion prediction unit 1224 and an ROI determination unit 1225.

The range captured by the imaging unit 111 is the same as in the embodiments described above, but only the ROI range specified by the ROI control unit 115 is sent from the image output unit 112 to the object tracking processing device 712 over the communication path 13.

The ROI image data sent to the object tracking processing device 712 is already the size required by the image processing device 14, so it only needs to be forwarded to the image processing device 14 as is. The target extraction unit 1221 extracts the target object to be tracked from the ROI image data, and the ROI determination unit 1225 determines the ROI of the next image data based on the extracted target object and notifies the ROI control unit 115. Here, the ROI at time t is determined from the image data at time t−1 (FIG. 8(a)). As a result, as shown in FIG. 8(b), it becomes difficult to capture the target object accurately when, for example, the target object moves fast relative to the frame size and frame rate. The position of the target object at time t must therefore be predicted from its past motion. The motion prediction unit 1224 predicts the motion of the target object, and the ROI determination unit 1225 determines the ROI at time t based on that prediction. This makes it possible to capture the target object, as indicated by the solid-line frame in FIG. 8(b).
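The comparative system's reliance on prediction can be illustrated with a constant-velocity extrapolation. The specification does not say which prediction model the motion prediction unit 1224 uses, so the model here, the square ROI, and the function names are assumptions chosen only to show how a change of direction defeats the prediction.

```python
def predict_next(prev, curr):
    """Constant-velocity extrapolation of the next position (assumed model)."""
    return (2 * curr[0] - prev[0], 2 * curr[1] - prev[1])


def roi_hits(center, half_size, actual):
    """True if `actual` falls inside a square ROI centered on `center`."""
    return (abs(actual[0] - center[0]) <= half_size
            and abs(actual[1] - center[1]) <= half_size)


prev, curr = (0, 0), (10, 0)          # object moving right at 10 px/frame
predicted = predict_next(prev, curr)

# The object keeps going straight: the predicted ROI captures it.
print(roi_hits(predicted, half_size=8, actual=(20, 0)))
# The object turns sharply instead: the predicted ROI misses it.
print(roi_hits(predicted, half_size=8, actual=(10, 10)))
```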

In the object tracking imaging system shown in FIG. 7, however, even if a higher frame rate can be achieved, the motion prediction unit 1224 must be added as an extra component. Moreover, when the target object moves differently from the motion assumed by the motion prediction unit 1224, it is difficult to capture the target object accurately and stably. By contrast, the embodiments of the present invention can capture the target object accurately and stably regardless of its motion, without adding a special component such as the motion prediction unit 1224.

The present invention can also be realized by the following processing: software (a program) that implements the functions of the embodiments described above is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of that system or apparatus reads and executes the program.

11: camera, 12: object tracking processing device, 13: communication path, 111: imaging unit, 112: image output unit, 113: imaging control unit, 114: resolution/ROI control unit, 121: image input unit, 122: ROI calculation unit, 123: ROI image generation unit, 124: image processing device, 1221: target extraction unit, 1222: low-resolution ROI determination unit, 1223: high-resolution ROI determination unit

Claims

An information processing apparatus comprising:
first determination means for determining a first region of interest, which is a region of interest of low-resolution image data captured at the next time point, based on image data captured at the previous time point;
obtaining means for obtaining, from the low-resolution image data, the low-resolution image data corresponding to the first region of interest; and
second determination means for determining a second region of interest, which is a region of interest of high-resolution image data captured at the next time point, based on the low-resolution image data corresponding to the first region of interest.

The information processing apparatus according to claim 1, wherein the first determination means determines the first region of interest so as to include any region that may be included in the second region of interest.

The information processing apparatus according to claim 1 or 2, wherein the second determination means determines the second region of interest based on the high-resolution image data captured at the previous time point in addition to the low-resolution image data corresponding to the first region of interest.

An information processing method executed by an information processing apparatus, the method comprising:
a first determination step of determining a first region of interest, which is a region of interest of low-resolution image data captured at the next time point, based on image data captured at the previous time point;
an obtaining step of obtaining, from the low-resolution image data, the low-resolution image data corresponding to the first region of interest; and
a second determination step of determining a second region of interest, which is a region of interest of high-resolution image data captured at the next time point, based on the low-resolution image data corresponding to the first region of interest.

A program that causes a computer to execute:
a first determination step of determining a first region of interest, which is a region of interest of low-resolution image data captured at the next time point, based on image data captured at the previous time point;
an obtaining step of obtaining, from the low-resolution image data, the low-resolution image data corresponding to the first region of interest; and
a second determination step of determining a second region of interest, which is a region of interest of high-resolution image data captured at the next time point, based on the low-resolution image data corresponding to the first region of interest.