WO2005088962A1

WO2005088962A1 - Tracking device and motion capture device

Info

Publication number: WO2005088962A1
Application number: PCT/JP2005/004176
Authority: WO
Inventors: Hiroshi Arisawa; Kazunori Sakaki
Original assignee: Hiroshi Arisawa; Kazunori Sakaki
Priority date: 2004-03-16
Filing date: 2005-03-10
Publication date: 2005-09-22

Abstract

[PROBLEMS] To track a moving object automatically and display the object on a screen of a tracking imaging camera, where the display is on a large scale and at the center of the screen. [MEANS FOR SOLVING PROBLEMS] A tracking device is a device for making a tracking camera track an object. The tracking device has a range finder (2) for measuring in a predetermined range the distance and angle from the position where an object is fixed, a fixed camera (3) for acquiring image data in the predetermined range, object information forming means for forming object information determining an object region, the forming made by using the distance data and angle data acquired by the range finder and the image data acquired by the fixed camera, and camera parameter forming means for forming camera parameter of the tracking camera by using the object information and the distance data. The range finder and the fixed camera are arranged in a fixed manner in the predetermined range.

Description

Specification

Tracking device and motion capture device

Technical field

TECHNICAL FIELD [0001] The present invention relates to a motion photographer, and more particularly to a tracking device for automatically tracking an object and a motion photographer provided with the tracking device.

Background art

In various fields such as industry, medicine, and sports, it has been attempted to take objects in the real world onto a computer and perform various processes on the computer. For example, information on the movement of a person or object or the shape of an object is acquired and used for analysis of movement of a person or object, formation of a virtual space, and the like.

However, because people and objects that want to be evaluated actually work under various environments, they are not necessarily suitable places for acquiring such information. In addition, in order to capture events that are being performed in the real world directly on a computer, it is necessary that work be performed without causing time to be taken by objects such as people or objects and their surrounding environment.

Heretofore, as a method of capturing such an object in the real world on a computer, a method called motion canopy has been known. This motion capture simulates the movement of a person or other moving body.

For example, mechanical, magnetic or optical motion cameras are known as motion cameras. In mechanical motion cameras, an angle detector or pressure sensor is attached to the performer's body, and the movement of the performer is detected by detecting the bending angle of the joints. A magnetic sensor exists by attaching a magnetic sensor to each part of the person's body, moving the performer in an artificially generated magnetic field, and detecting the density and angle of magnetic field lines with the magnetic sensor. The position is derived to detect the actor's movement.

Further, in the case of an optical motion camera, a marker is attached to a place where it is desired to measure the movement of the performer's body, and this marker is imaged by a camera to measure the movement of each part of the marker. [0007] In any method, it is necessary to attach a detector, sensor or marker to the subject, which is a burden on the subject. Even in the case of an optical motion capture camera that can obtain high accuracy, it is necessary to mount several tens of markers in order to obtain the movement of the entire human body, and its application is limited.

[0008] On the other hand, there is also proposed a motion capture that is not burdensome for the subject. This motion capture captures the movement of the human body in a non-contact manner by using the images of a multi-viewpoint camera and taking correspondence with a virtual three-dimensional human body model. Non-patent documents 1 and 2 are examples of methods for matching such multi-viewpoint video with a model. In Non-Patent Document 1, a posture is estimated by overlaying and evaluating a three-dimensional model and a silhouette image in which only a subject is cut out in each image. In Non-Patent Document 2, the difference between the current image and the next image is obtained, and the difference is used to determine the posture.

Conventional motion cameras have a problem that the range for capturing the motion of the subject is narrow, and a wide-area motion camera capable of capturing a subject in a wide range is required.

In this wide area motion camera, a system has been proposed in which an object is captured by a plurality of cameras. In this wide-range motion camera, a configuration has been proposed that measures sensor data (such as pan, tilt, and furkas) of each camera accurately, and stores sensor data of multiple cameras and uncompressed images in synchronization. . In this configuration, when shooting an object, spatial constraints are reduced by changing the camera shake and zoom.

[0011] In order to perform operations such as camera shake and zoom change with a plurality of cameras in a wide area motion camera while being powerful, more people than the number of cameras are required. The need for multiple camera operators is a major burden in achieving wide area motion capture.

An automatic tracking system can be considered as a method for reducing the burden on camera operation. In particular, in automatic tracking used for wide area motion cameras, it is required that subject information be stably and accurately grasped, and data suitable for camera control be obtained.

In the wide area motion camera, the subject is as large as possible on the screen of the imaging camera. It is often required to be projected to the center of the screen. In addition, since the accuracy of motion acquisition is determined by the image obtained from the imaging camera to be tracked, the control as accurate as possible is required.

In addition, at the time of tracking, it is impossible to impose new burdens such as attaching markers and sensors to the subject.

FIG. 15 is a view for explaining an object image projected on a screen in a wide area motion camera.

If only tracking is performed so that the subject is in the imaging range, the distance between the subject and the camera changes with the movement of the subject, so for example, as shown in FIG. As shown in (c), the image may be projected at a position out of the center of the screen, the size of the image may be too small with respect to the screen, or only a part of the subject may be missing.

In the wide area motion camera, regardless of the distance between the subject and the camera, the image of the subject is displayed at the center of the screen as shown in (b) and (d), and the motion of the subject is captured. It is required that the desired part be projected as large as possible.

Conventionally, as an automatic tracking system, processing is not performed while the camera is operating, and frame differences are taken while the camera is stopped, an object with motion is extracted, and the camera is controlled so that the object is at the center of the screen. There is a method, but with this method, subject information can not be acquired unless the camera is stopped, so there is a problem that it takes time to acquire information and tracking can not be performed. In the case of images, if the subject is not moving, it can not be detected, and the control of the zoom is performed only with the size of the detected subject area.

On the other hand, as a method capable of performing inter-frame subtraction even during operation of the tracking camera, pan and tilt amounts are calculated by measuring the motion of the camera's pan or calculating the image strength motion vector. A method for moving the image by the amount has been proposed (Non-Patent Documents 3 and 4) o This method controls that the zoom processing is based on the size of the area where there is a difference, and the movement of the subject is not !, If it can not be detected, there is a problem in terms of the ability to follow the speed, movement In addition, as a method of extracting the subject even when the subject is stationary while the camera is in operation, the background of the camera movement range is acquired in advance, and the pan and tilt amount points are cut out of the background area at that time. It has been proposed that the difference processing is performed on the image (Non-Patent Documents 5 and 6).

Also in this method, there is a problem that the subject loses the screen power or can not be tracked when moving fast, and the zoom control is also control with the size of the area, so the accuracy is not good. There is a problem of

[0022] A method has also been proposed that uses human skin color to control a subject area on an image so that that area is the center of the screen (Non-Patent Document 7). Although it can be used under conditions, the part that can be grasped is a limited part such as the face and hands, so it is not suitable for photographing the whole body.

In addition, the applicant has previously filed a patent application relating to a motion cap (see Patent Documents 1 and 2).

[0024] Non-Patent Document 1: Journal of the Institute of Electronics, Information and Communication Engineers D-II Vol. J 82-D-II No. 10 pp. 17 39-1749 October 1999 "Determination of posture of a person by movement and formation model"

Non-Patent Document 2: Transactions of the Institute of Electronics, Information and Communication Engineers D-II Vol. J80-D-II No. 6 pp. 158 1-1589 June 1997

Non Patent Literature 3: Masahiko Horiguchi, Yoshinori Takeuchi, Noboru Onishi "Smooth Pan-Tilt Moving Object Tracking", IEICE Technical Report IE, Image Engineering Vol. 99 Num. 610 pp. 43-48 (2000.02)

Non Patent Literature 4: Toshihiko Morita "Detection and tracking of motion by local correlation operation", Technical report of IEICE Technical Conference PRUM, Pattern recognition 'Media understanding' Vol. 100 Num. 44 pp. 55-62 (2000.05)

Non-Patent Document 5: Naoya Hada, Tetsuo Miyake "Tracking of Moving Objects with Active Vision System", The Journal of the Institute of Electronics, Information and Communication Engineers D-II Vol. J84-D-II No. (2001.1)

Non Patent Literature 6: Takashi Matsuyama, Toshikazu Wada, Tobu Obabe "Real-time target detection 'tracking using fixed viewpoint pan-tilt-zoom (<Special feature> Image recognition'understanding)", IPSJ Journal Vol. 40 No. 8 pp. 3169-3178 (1999. 08) Non-Patent Document 7: Kazuyuki Mitsuka, Keiichi Yamamura, Tokui Yamanaka "Development of Intelligence Robot Camera", Technical Report of the Institute of Television Engineers of Japan Vol. 17 No. 51 pp. 33- 37 (1993.09)

Patent Document 1: Japanese Patent Application No. 2002-379536

Patent Literature 2: # 112003-116631

Disclosure of the invention

Problem that invention tries to solve

The conventional method is to track the subject only with the information obtained by the tracking camera. The obtained information is limited to the position of the center of gravity of the subject area on the image and the size of the area. Since the subject area has a bias, even if the area force also determines the center of gravity, the subject is not centered on the actual screen, and tracking can not be performed if the subject is off the screen. There is a big problem as a tracking system.

As described above, in the conventionally proposed motion camera, there is a demand for automatically tracking the moving subject and enlarging the subject on the screen of the imaging camera to be tracked and for projecting the subject in the center of the screen. The ones that satisfy the above are not known to the extent of the inventor.

Therefore, the present invention solves the above-mentioned conventional problems, and aims to automatically track a moving subject, and projects the subject largely on the screen of the imaging camera to be tracked, and in the center of the screen. The purpose is to project.

Means to solve the problem

The present invention uses a range finder and a fixed camera to obtain a two-dimensional absolute position of a subject from distance data and angle data that can also be obtained from a range finder, and the size of the subject using image data obtained from fixed camera power. By acquiring the center of the subject and combining them, a tracking imaging camera (hereinafter referred to as a tracking camera) forms a force melanometer for tracking the subject.

In the present invention, by obtaining the two-dimensional position of the subject, the size of the subject, and the center of the subject, the tracking camera is controlled in accordance with the movement of the subject by controlling the pan and tilt of the tracking camera. At the same time, the subject is photographed in the center of the screen displayed by the tracking camera, and the zoom of the tracking camera is controlled to make the subject large on the screen, and the focus of the tracking camera is controlled. Focus on.

The present invention can be embodied as a tracking device, a tracking method, and a motion capture device including the tracking device.

The form of the tracking device of the present invention is a tracking device for causing a tracking camera to track an object, and a range finder for measuring the distance and angle from a fixed position of an object within a predetermined area, and A subject camera for forming subject information for determining a subject area by using a fixed camera for obtaining image data of the area, and distance data and angle data obtained by the range finder and image data obtained by the fixed camera; Each means of the camera parameter formation means for forming the camera parameters of the tracking camera using object information and distance data is provided, and the range finder and the fixed camera are fixed and arranged in a predetermined area.

The range finder is a measuring means for detecting the distance to the subject and the angle, and for example, a laser-one range finder can be used. The laser range finder is a device that scans in the horizontal direction while changing the irradiation angle of the laser and measures the distance and angle to the subject by detecting the reflection of the subject's force. By fixing the range finder at a predetermined position, it is possible to detect a two-dimensional absolute position of the subject in a predetermined area. In addition, the fixed camera captures an image of a predetermined area and acquires image data of the area.

By fixing the range finder and the fixed camera with respect to a predetermined area, it is possible to associate the positional relationship between data from which both measurement means can be obtained. For example, fixing the range finder and the fixed camera at the same position makes it easy to associate the distance and angle data obtained by the range finder with the image data obtained by the fixed camera.

Image data captured by the fixed camera includes subject data and background data. Background data is image data which does not change regardless of the movement of the subject within a predetermined area. On the other hand, subject data changes as the subject moves within a predetermined area.

According to the present invention, in order to know the position and size of a subject, a subject area where the subject is present is determined from image data captured by a fixed camera, and subject information of the position and size of the subject is acquired from this subject area. Do. The subject information forming means uses background data from image data. The subject area is acquired from the difference image data obtained by subtracting the data.

Here, when the background data is subtracted from the total image data force captured by the fixed camera, the processing time takes a long time. Therefore, according to the present invention, high-speed processing is performed by performing difference processing only on the candidate area formed by the candidate area forming means. The candidate area forming means forms a candidate area for performing image difference out of the whole area using the distance data and angle data acquired by the range finder.

The present invention also includes a differencer that subtracts background data from image data. The differencer is a hardware configuration that acquires background image processing that subtracts image data and background data only in the candidate area to acquire difference image data, and is not processing by software, so that difference processing can be performed with high-speed processing. . The subtractor may have a circuit configuration that performs, for each pixel, differential processing between image data sequentially input from the fixed camera and background data acquired in advance. The differential image data can be output at the same speed as the image data input from the fixed camera by sequentially differential processing the image data input serially as the fixed camera power for each pixel, and for real time tracking. It is possible to form camera parameters and to track the subject in real time.

The subject information forming means obtains the size of the subject from the difference image data obtained by the difference processing. Further, the distance data force obtained by the range finder also obtains the position of the subject.

In order to cause the tracking camera to automatically track the subject, it is necessary to control the camera parameters of the tracking camera and drive a drive device provided in the tracking camera.

As the camera parameters, a focus value for focusing the camera on the subject, a pan value and tilt value for aligning the subject on the subject, and the subject at the center of the screen, and There is a zoom value that determines the size of the image.

The camera parameter formation means also forms a focus value as the distance data force, forms a position force pan value and a tilt value of the subject of the subject information, and forms a zoom value from the size of the subject of the subject information.

The camera parameter formation means forms camera parameters of each tracking camera based on calibration data of the plurality of tracking cameras. Thereby, even if the tracking camera is disposed at an arbitrary position, it is possible to form camera parameters suitable for each tracking camera. it can. Each of the formed camera parameters is transmitted to the corresponding tracking camera, and controls a drive provided in the tracking camera. The driving device drives the tracking camera according to the transmitted camera parameters, controls pan and tilt to center the tracking camera on the center of the subject, and controls the zoom to adjust the size of the image so that it is projected on the screen. , Focus control by controlling.

In addition, when the position of the tracking camera relative to the predetermined area can be acquired, calibration is sequentially performed based on the position data to track the moving subject even if the tracking camera itself moves. can do.

The tracking device of the present invention is characterized by the arrangement configuration of the range finder and the fixed camera.

One embodiment of a tracking device according to the present invention is a tracking device for causing a tracking camera to track an object, comprising: a range finder for measuring a distance and an angle from a fixed position of an object within a predetermined area; A configuration that includes a fixed camera that acquires image data of a region, and position information of a range finder and the reference position of the fixed camera, and acquires distance data, angle data, and image data based on the reference position. Do.

Another embodiment of the tracking device according to the present invention is a tracking device for causing a tracking camera to track an object, and a range finder for measuring the distance and angle from a fixed position of an object within a predetermined area. And a fixed camera for acquiring image data of a predetermined area, and the range finder and the fixed camera are vertically overlapped and arranged on the same plane position, and distance data and angle data based on the same position. And acquire image data.

The fixed camera includes a plurality of imaging cameras arranged with different horizontal orientations, and the range finder is configured to arrange at least one above and Z or below the imaging cameras. In addition, two range finders can be arranged in the vertical direction.

The measurement range in the horizontal direction of the range finder includes the imaging range in the horizontal direction of the fixed camera, or the imaging range in the horizontal direction obtained by combining image data of a plurality of fixed cameras. By combining image data of a plurality of fixed cameras, a wide imaging range in the horizontal direction can be obtained. Note that the image data of multiple tracking cameras is horizontally overlapped Thus, the omission of image data can be prevented.

The present invention is also in the form of a motion capture device provided with the tracking device of the present invention.

The motion capture device of the present invention comprises at least one tracking camera and image data processing means for synchronously acquiring and storing imaging data of the tracking camera, and the tracking device is formed Drive control of each tracking camera based on the selected camera parameters to acquire tracking image data of the subject.

The form of the tracking method according to the present invention is a tracking method for causing a tracking camera to track an object, which comprises: measuring a distance and an angle from a fixed position of an object within a predetermined area; Step of acquiring image data of a region, step of forming object information which determines object region using distance data and angle data and image data, camera parameters of tracking camera using object information and the distance data Each step of the steps of forming Thereby, distance data, angle data, and the image data are acquired at a fixed position with respect to a predetermined area.

Further, the step of forming subject information includes a step of determining a subject area from difference image data obtained by subtracting background data from image data force, and a step of acquiring subject information of the position and size of the subject.

Further, in the step of forming the subject information, more specifically, the step of forming a candidate area for performing image difference from the distance data and the angle data, and the process from the image data to the background only in the candidate area. There are the steps of performing background subtraction processing to subtract data and obtaining difference image data to determine a subject area, obtaining a size of a subject from the difference image data, and obtaining a position of the subject as well as distance data strength. Prepare. Effect of the invention

According to the present invention, a moving subject can be automatically tracked. In addition, it is possible to project the subject largely on the screen of the imaging camera to be tracked and also to project it at the center of the screen.

Brief description of the drawings

FIG. 1 is a view for explaining an outline of a motion photographer. FIG. 2 is a view for explaining an outline of a motion capture device provided with the tracking device and the tracking device of the present invention.

FIG. 3 is a diagram for describing an outline of a flow of tracking processing according to the present invention.

FIG. 4 is a block diagram for explaining one configuration example of a tracking device of the present invention.

FIG. 5 is a flowchart for explaining the tracking process of the present invention.

FIG. 6 is a diagram for explaining setting of candidate areas according to the present invention.

FIG. 7 is a view for explaining a first mode of formation of a subject area according to the present invention.

FIG. 8 is a view for explaining a second mode of formation of a subject area according to the present invention.

FIG. 9 is a view for explaining background difference processing of the present invention.

FIG. 10 is a view for explaining the formation of pan and tilt of the present invention.

FIG. 11 is a view for explaining the configuration of the tracking device of the present invention.

FIG. 12 is a view for explaining data obtained by a plurality of fixed cameras and a plurality of range finders of the present invention.

FIG. 13 is a diagram for explaining processing when a plurality of range finders of the present invention form one subject.

FIG. 14 is a view for explaining an example of the system configuration of the motion capture device of the present invention.

FIG. 15 is a view for explaining an object image projected on a screen in a wide area motion scanner.

BEST MODE FOR CARRYING OUT THE INVENTION

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

FIG. 1 is a view for explaining an outline of a motion photographer. In a wide area motion camera, the motion of the subject is acquired in a two-step procedure.

[0058] FIG. 1 (a) shows the first step, in the initial frame, the position and orientation of each camera 4A-4D in a unified coordinate system such as the world coordinate system, a calibration board or a check pattern. The initial state is determined using a reference object 101 such as.

FIG. 1 (b) shows the second step, in which the behavior of each camera 4A-4D at each time point is acquired using the data processing apparatus 100 which acquires and records in synchronization with the video camera image in the next frame, , The positions of pan, tilt, zoom, and focus at each point are recorded, the state displacement from the initial state is acquired, and the calibration is dynamically changed. As a result, the subject moving in the wide area space is imaged while shaking the head of the camera, and the motion of the subject 111 is acquired by model analysis or the like.

FIG. 2 is a view for explaining an outline of a motion capture device provided with the tracking device and the tracking device of the present invention.

In FIG. 2, the motion capture device captures the motion of the subject 11 in the predetermined area 200 with a plurality of imaging cameras 4 (4A to 4D), and the data processing means 100 analyzes the subject 111 by model analysis or the like. Get the action.

According to the present invention, the imaging camera 4 (A4-4D) is provided with a driving device for changing the pan, tilt, zoom, focus, etc. of the camera, and the tracking camera (hereinafter referred to as tracking camera 4A-4D). Note

In this example, four tracking cameras 4A to 4D are shown. The number of force tracking cameras is not limited to four, and one or a plurality of optional tracking cameras can be used.

The present invention is provided with a tracking device for causing the tracking camera 4 to track the subject 11. The tracking device includes control means 1, range finder 2 (2 A, 2 B), and fixed camera 3. The range finder 2A, 2B and the fixed camera 3 are fixed at a predetermined position P with respect to the predetermined area 200, and the range finders 2A, 2B measure the distance and angle between the predetermined position P and the subject 11. . Also, the fixed camera 3 captures a background and an object 11 as viewed from the predetermined position P, and acquires camera image data.

In FIG. 2, an example of a configuration in which the range finder 2 and the fixed camera 3 are installed at the same two-dimensional position, and the range finder 2 A and the range finder 2 B are arranged at the upper limit position sandwiching the fixed camera 3. Is shown.

Fixed camera 3 outputs image data in which the entire predetermined area 200 is imprinted. Further, the range finder 2 outputs, on a plane in the predetermined area 200, data of the distance from the position force subject 11 where the range finder 2 is installed and the angle from the reference position. As shown in FIG. 2, when the two range finders 2A and 2B are arranged in the vertical direction, the distance data and angle data to the subject 11 in different planes in the predetermined area 200 are output.

By setting the installation heights of the range finders 2A and 2B to be different, distance data and angle data can be obtained. You can change the height of the plane to get the data. The plane scanned by the range finder 2 may be limited to a horizontal plane, and may be installed at a predetermined elevation angle.

The control means 1 forms camera parameters of each tracking camera 4 based on data acquired by the range finders 2 A and 2 B and the fixed camera 3, and sends these camera parameters to the respective tracking cameras 4. The camera parameters are formed in real time each time the subject 11 moves.

The drive device provided in the tracking camera 4A-4D automatically tracks the subject 11 by controlling pan, tilt, zoom and focus based on the camera parameters sent from the control means 1, The subject 11 is enlarged to the center of the screen of each camera.

FIG. 3 is a diagram for describing an outline of a flow of tracking processing of the present invention. In FIG. 3, according to the tracking processing of the present invention, camera parameters for causing each tracking camera to track the subject are formed based on the data obtained from the range finder and the fixed camera, and the camera camera uses the camera parameters to track the tracking camera. Control.

The range finder measures distance data r and angle data Φ to obtain a two-dimensional absolute position of the subject in a predetermined area. In addition, the fixed camera captures an image in a predetermined area, and acquires image data including background data and subject data.

[0071] As camera parameters, a pan for controlling the horizontal swing of the tracking camera, a tilt for controlling the vertical swing, and a size for determining the size of the subject displayed on the screen of the tracking camera. There is a focus to focus the camera of the camera.

Among these camera parameters, pan, tilt, and zoom are formed based on subject information including position information and size information of the subject, and focus is formed based on the distance between the tracking camera and the subject. Do. Note that the position of the subject can be set to any position on the subject. Since the normal subject is displayed at the center of the screen, it is appropriate to set the center of the subject.

[0073] First, pan, tilt, and zoom formation processing will be described. As described above, pan, tilt and zoom camera parameters are obtained from object information.

Here, the process of forming pan, tilt and zoom is facilitated by representing the subject information by a rectangular area (hereinafter referred to as a subject area) including the subject inside. this The pan, tilt, and zoom are camera parameters for making the tracking camera follow the subject and for capturing the subject's image as large as possible in the center of the camera's screen, and the screen is usually rectangular in shape. This is because it is sufficient if it is possible to grasp a rectangular area including the subject inside, and even the minute shape of the subject is unnecessary.

In the tracking process of the present invention, the subject region representing the subject information is obtained based on distance data and angle data obtained from the range finder, and image data obtained also for the fixed camera power. The subject area can be obtained from subject data included in the image data, and the image data strength can also be determined by background subtraction processing that subtracts the background data captured in advance.

When this background difference processing is performed on the entire area of a frame captured by a fixed camera, the processing time is long, and it is difficult to perform tracking in real time. The image captured by the fixed camera includes a background part that does not change even when the subject moves, and a subject part that changes with the movement of the subject. Background subtraction processing of the background part is used to acquire tracking data. Is an unnecessary process, and it is sufficient that only subject data related to the movement of the subject required for acquisition of tracking data is acquired.

The tracking processing of the present invention narrows down the area where the subject data may exist, and reduces the amount of data to be subjected to difference processing by performing background difference processing not on the entire area but only on a partial area. To increase processing speed. The tracking process of the present invention narrows the processing area based on the distance data and angle data obtained by the range finder. Hereinafter, this narrowed processing area is referred to as a candidate area. In the background subtraction processing, subject data is acquired by subtracting background data from image data only for candidate areas in image data captured by a fixed camera.

Pan and tilt are represented by the world coordinates V among the subject information represented by the subject area, and the position of the subject (the center position when the subject is photographed at the center of the tracking camera screen), Calculated by coordinate conversion to camera coordinates. Further, in the zoom, the focal length is obtained from the size of the subject in the subject information represented by the subject area, and the zoom value is obtained from the correspondence relationship between the focal length and the zoom value obtained in advance.

[0079] Next, focus formation processing will be described. As mentioned above, the turtle in focus The parameter is obtained from subject information formed based on the distance between the tracking camera and the subject. Here, the correspondence relationship between the distance and the focus can be obtained in advance, and the focus can be obtained by reading out the focus corresponding to the distance data obtained by the range finder.

In the formation of the camera parameters, calibration of tracking cameras installed at various positions with respect to a predetermined area is performed using calibration data of a range finder, a fixed camera, and a tracking camera which are obtained in advance. To create force melanometers corresponding to each tracking camera.

Here, when the position data of the tracking camera itself can be sequentially obtained, calibration can be performed based on the position data, so that the moving subject can be tracked by the moving tracking camera.

Next, one configuration example of the tracking device of the present invention will be described using the block diagram of FIG.

The tracking device includes a range finder 2 fixed to a predetermined area, a fixed camera 3 similarly fixed to the predetermined area, distance data and angle data obtained by measurement with the range finder 2, and a fixed camera There is provided a control means 1 which inputs image data obtained by imaging in 3 and forms camera parameters for causing each tracking camera 4 to track an object, and transmits the camera parameters to each tracking camera 4.

Control means 1 includes serial board la, candidate area forming means lb, video frame buffer 1c, background data Id, difference unit le, memory If, object information forming means lg, camera parameter forming means lh, transmitting means Have li.

The range finder 2 is a measuring means for detecting the distance to the subject and the angle, and for example, a laser range finder can be used. The laser range finder scans horizontally while changing the irradiation angle of the laser, acquires angle data based on the irradiation angle of each laser, irradiates the laser at the irradiation angle, and then reflects from the subject. The distance data to the subject is acquired by the time difference until the signal is detected. The distance data is the distance between the installation position of the range finder and the subject. A signal in the absence of the subject is obtained in advance, and the area where the subject is present is grasped by comparing the background data with the measurement data. And the distance between this area and the position where the range finder is installed. Also, the scanning angle when scanning in the horizontal direction The angle data at which the subject exists is acquired from. By fixing the range finder at a predetermined position, it is possible to detect the two-dimensional absolute position of the subject within the predetermined area.

[0085] A plurality of range finders 2 can be installed at the same position on the plane, and the detection height can be changed by making the installation heights different. A subject can be detected. Also, if one range finder can not cover the entire area of a given area, two-dimensional position data can be acquired by installing it at multiple locations.

The fixed camera 3 captures an image of a predetermined area to obtain image data of the area. The area captured by the fixed camera 3 is always the same regardless of the movement of the subject. By fixing the camera range finder and the fixed camera with respect to a predetermined area, it is possible to associate the positional relationship between data from which both measurement means can be obtained. For example, fixing the range finder and the fixed camera at the same position makes it easy to associate the distance and angle data obtained by the range finder with the image data obtained by the fixed camera.

The distance data and angle data acquired by the range finder 2 are input to the candidate area forming means lb via the serial board la as the range data recorded in the memory If. The candidate area forming means lb finds the area where the subject is present, and narrows down the area where the background difference processing is performed. The range data recorded in the memory If is a range image. Range Finder The obtained range data is a signal obtained by scanning and is a range image represented by a scanning angle and signal intensity at the scanning angle.

The image data captured by the fixed camera 3 is stored in the video frame buffer lc, and the image data corresponding to the area of the candidate area is read out from the video frame buffer lc by the candidate area forming means lb and the difference unit Perform background difference processing with background data Id stored in advance by le. In the background difference processing, the image data of the same pixel in the candidate area is read for each pixel from the video frame buffer lc and the background data Id, and the difference processor le performs difference processing for each pixel.

This differencing device le can be configured by hardware. This background difference processing is performed by performing difference processing for each pixel and configuring a difference unit with nodeware. The speed can be increased and differential images can be acquired in real time. The acquired difference image is recorded in the memory if. This difference image is used to form a subject area for forming a finder parameter together with the range image.

Further, the image data of the video frame buffer lc is recorded as a real image in the memory If, and is used for fine correction processing in formation of a subject area. This is because when performing subtraction based on color information in background subtraction processing, if the subject color is close to the background color, subject data can not be acquired correctly because the subject part is removed as the background. In order to eliminate this problem, the actual image is used to correct this scraped portion.

Object information (object region) forming means lg inputs a range image, a difference image, and an actual image from the memory If. As described above, the input difference image is finely corrected using the real image, and a rectangular area including the corrected difference image is formed as a subject area. If fine correction with a real image is not necessary, the subject area may be formed directly from the difference image which does not need to be finely corrected the difference image.

Further, the subject information (subject region) forming means lg obtains the position of the subject as well as the range image power. At this time, as described above, it is usually appropriate to find the center of the subject. 1S You may set another part as the position of the subject if necessary.

The camera parameter formation means lh receives information of the size of the subject and the position of the subject from the subject information formation means lg, forms the camera parameters of each tracking camera as described above, and forms the formed camera parameters. Transmit to each tracking camera 4.

Hereinafter, a flowchart for explaining the tracking process shown in FIG. 5, a diagram for explaining the setting of the candidate area shown in FIG. 6, and a first embodiment of the formation of the subject area shown in FIG. , FIG. 8 is a diagram for explaining the second mode of formation of the subject region, FIG. 9 is a diagram for explaining the background difference processing, and FIG. 10 is a diagram for explaining the formation of pan and tilt This will be described using The procedure of the tracking process is indicated by the number attached to (Step S) shown in the flowchart of FIG.

FIG. 6 (a) schematically shows the detection state of the subject by the range finder, O indicates the reference point of the range finder, and A indicates the distance to the detection point as well. . Since the range finder scans in the horizontal direction, the locus shown in A Contains distance data. When there is no subject, the position on the locus A can be regarded as a wall surface, and has a fan-like shape. When there is a subject, the distance from the reference point O is detected as a short area. For example, B is a region where the distance is rl and the angle is θ 1, and B is a region where the distance is r 2 and the angle is Θ2. The distance and angle of this area can be represented by an average value because the subject has a large width. In the subsequent background difference processing, this area is used to narrow the range in which the background difference processing is performed.

The first form of narrowing down of candidate areas is to narrow down candidate areas by extracting points connected to the area from image data.

FIG. 6 (b) shows the relationship between the range data and the image data, and when the subject 11A, 11B is present in the image data, the range data corresponding to the subject 11A, 11B is placed in the range data. Data is detected. Since the range finder detects a subject on a plane, the range data 20 is represented by areas 21A and 21B on the line 22 shown by broken lines in FIGS. 6 (b) and 6 (c). Ru. FIG. 6 (d) shows the image data corresponding to FIG. 6 (c), and the subjects 31A and 31B are displayed. (Step S1)

In this mode, while narrowing down the candidate area, the background difference is performed to determine the subject area. FIG. 7 is a diagram for explaining this process.

FIG. 7 (a) shows the subject 31 in the image data 30, and FIG. 7 (b) shows the line 22 and the area 21 in the range data 20. FIG. Here, the subject 31 can be viewed as a collection of points continuously connected from the area 21. Therefore, a group of points (pixels) connected to the area 21 is sequentially determined (Fig. 7 (c), (d), (e)), and a rectangular area D obtained thereby is set as a candidate area (Fig. 7 (f)). Difference processing is performed, and a pixel whose value obtained as the difference is equal to or more than a threshold value is set as a subject. (Step S2)

In the second form of narrowing down candidate areas, a virtual subject space in a three-dimensional space is a candidate area, and this space is projected on the screen of a fixed camera, and the space is projected. The background difference processing is performed only for the pixels, and the pixels having the value obtained by the difference and the threshold value or more are set as the subject.

FIG. 8 shows a state in which a virtual subject space in a three-dimensional space is projected on the screen of a fixed camera. Range data is a value of two-dimensional coordinates, which is one plane of the world coordinate system Can be converted as a point (XY plane) of This means that the subject (the center of the subject) exists somewhere on a straight line (this straight line is 1) extending in the direction perpendicular to the plane from that point.

On the other hand, when the image data of the fixed camera is converted to the world coordinate system, one straight line (this straight line is assumed to be m) is determined in the three-dimensional world coordinate system as one point on the two-dimensional image. .

For each pixel of the fixed camera, the projection is performed on the world coordinate system, and the minimum distance between the straight line 1 and the straight line m is calculated. Since the approximate center of the subject is present somewhere on the straight line 1, background subtraction processing is performed only for objects within a certain range of the calculated distance, and pixels with this difference are taken as the subject.

FIG. 9 is a diagram for describing the narrowing down of the candidate area and the background difference. FIG. 9 (a) shows the image data 30, and FIG. 9 (b) shows the background data. The image data 30 contains data of the subject 31 and the background 41, and the background data 40 contains data of the background 41. In each data, the data of the range of the candidate area is taken out. FIG. 9 (c) shows image data in the candidate area 50, and FIG. 9 (d) shows background data in the candidate area 50. FIG. A difference image is obtained by performing background difference processing on the image data and background data in these candidate areas (Fig. 9 (e)).

The data strength of the difference image The subject information can be calculated, for example, as follows. For all pixels considered to be subject areas, calculate the three-dimensional coordinates of two points on straight line 1 and straight line m that constitute the minimum distance for the pixels with differences, and calculate the Z coordinates of the points on straight line 1 Calculate the minimum and maximum values. Two-dimensional coordinates (xy coordinates) and (maximum value minimum value in the world coordinate system converted from the range finder by using (maximum value-minimum value) as the size of the object using the minimum value and maximum value thus determined ) The three-dimensional coordinates plus Z2 + minimum value is the position of the subject.

In addition, for the distance to the subject, the distance force between the origin coordinate and the point of the camera coordinate of the tracking camera in the world coordinate system is acquired. (Step S2)

The subject area determined from the background difference data is finely corrected using the actual image as described above (step S3), the camera parameters are determined (step S4), and the determined camera parameters are determined. The tracking camera is controlled based on it (step S5). Repeat this process Tep S6).

In the formation of the camera parameters, the tracking camera obtains the correspondence between the world coordinate system and the camera coordinate system by calibration. At this time, a point P on the world coordinate system can be imprinted at the center of the image by the line passing through the point P overlapping the Z axis of the camera coordinate system which is the optical axis. FIG. 10 shows this positional relationship. At this time, the angular force between the straight line and the XY plane of the camera coordinate system is determined, and the angular force between the straight line and the YZ plane of the camera coordinate system is also determined as the tilt movement angle.

Also, the zoom can be controlled by the focal length of the tracking camera. The focal length f can be obtained by f = (distance from lens to subject) X (image size) Z height (width) of subject. Since the camera specification power is also known as the image size, the zoom data can be obtained by acquiring the distance data power of the range finder from the lens to the object and acquiring the height (width) of the object from the object information. it can. Note that, by obtaining the correspondence between the focal length and the pulse value for control, it is possible to calculate the pulse value for performing zoom control of the focal length force tracking camera.

Further, since the focus is related to the distance from the lens to the subject according to the specification of the lens, the focus value can be obtained by acquiring the distance from the lens to the subject from the distance data of the range finder. Note that by finding the correspondence between the distance and the pulse value for control, it is possible to calculate a pulse value for performing focus control of the distance force tracking camera.

Next, a configuration example of the tracking device will be described.

The tracking device of the present invention is a tracking device for causing the tracking camera to track an object, and is characterized in the arrangement configuration of the range finder and the fixed camera, and the range finder and the fixed power camera are overlapped in the vertical direction. Arrange on the same plane position and acquire distance data, angle data and image data based on the same position.

FIG. 11 (a) shows an example of the configuration of the tracking device of the present invention. In this embodiment, range finders 2A and 2B are provided above and below the fixed camera 3. The range finder 2A, 2B and the fixed camera 3 are mounted on a vertically erected support and placed on the same plane in the same position. By this, distance data, angle data and image data based on the same position are obtained. Get the

FIG. 11 (b) shows another form of the configuration of the tracking device of the present invention. This form is configured to include a plurality of fixed cameras 3.

The fixed cameras 3A, 3B, 3C are arranged with different horizontal orientations, and the range finder 2

At least one of A and 2B is disposed above and Z or below fixed cameras 3A, 3B and 3C. Further, the range finders 2A and 2B can be configured to arrange two in the vertical direction.

The measurement range in the horizontal direction of the range finder 2A, 2B is the horizontal imaging range obtained by combining the horizontal imaging range of the fixed cameras 3A, 3B, 3C or the image data of a plurality of tracking cameras. Include range. By combining the image data of the plurality of fixed cameras 3A, 3B, 3C, a wide imaging range in the horizontal direction can be obtained. By providing an overlap in the horizontal direction to the image data of a plurality of fixed cameras, omission of the image data can be prevented.

FIG. 12 is a diagram for explaining data obtained by a plurality of fixed cameras and a plurality of range finders. Here, the line 22 of the range data is shown superimposed on the image data 30.

FIG. 12 (a) is an example of image data and range data when one fixed camera and one range finder are combined. The subject 31 is formed only when it is present in the image data 30 at a position overlapping the line 22 of the range data.

FIG. 12 (b) is an example of image data and range data when three fixed cameras and one range finder are combined. The image data 30 can be formed by connecting image data 30A, 30B and 30C obtained from three fixed cameras. The image data 30A, 30B, and 30C can be prevented from missing portions of the image by capturing with overlapping portions.

Further, FIG. 12 (c) is an example of image data and range data when three fixed cameras and two range finders are combined, and the range finder is arranged in the vertical direction. Obtain range data at different heights.

Image data 30 includes image data 30A, 30B, and 30C obtained from three fixed cameras. It can be formed by unifying, and two range data lines 22A, 22B by two range finders can be obtained for this image.

The plurality of range finders can acquire range data at different heights, and therefore, subjects 31a present at low positions and subjects 31c present at high positions can also be detected.

In the configuration provided with a plurality of range finders as shown in FIG. 12 (c), one range subject may be detected by a plurality of range finders. In such a case of determining the subject area, the subject area can be formed as one subject by applying a process called a so-called exudation method.

FIG. 13 is a diagram for explaining processing when the plurality of range finders form one object.

FIG. 13 (a) shows the relationship between the subject 31 and the two range data lines 22A and 22B. In this state, as shown in FIG. 13 (b), areas 21a, 21b, 21c of range data for detecting an object at a plurality of places are obtained. Under this condition, each range data can not distinguish whether a different subject or a single subject is detected.

Therefore, as shown in the first form of formation of the subject area, the pixels connected to each of the areas 21a, 21b and 2lc are extracted (FIGS. 13 (c), (d), (e) )) It is determined whether or not the connection between these pixels is generated, and this determination is made to distinguish whether it is a different subject or a single subject.

FIG. 14 is a diagram for explaining an example of the system configuration of the motion capture device.

In FIG. 14, the imaging cameras 4A to 4D, which are tracking cameras, are synchronized by the synchronization signal from the external synchronization signal generator 10. In addition, data of pan, tilt, zoom, and focus of each imaging camera 4A-4D is also collected by the data collector 6 in synchronization. Also, the data collector 6 receives the frame count from the frame counter 5 to obtain data synchronization between frames.

The data control device 7 synchronizes the data of the camera parameters and the image data of the imaging camera 4A-4D in a synchronized manner, and accumulates them in the video data storage device 8. The host control unit 9 stores data stored in the video data storage unit 8 and a camera parameter storage unit. Data can be input and simulation can be performed.

Industrial applicability

The present invention can be applied to various fields such as analysis of exercise and application to rehabilitation, application to sports broadcasting, and application to an environment in which a person can not operate a camera.

Claims

The scope of the claims

[1] A tracking device for causing a tracking camera to track an object,

A range finder that measures the distance and angle from the fixed position of the object within a predetermined area;

A fixed camera for acquiring image data of the predetermined area;

Subject information forming means for forming subject information for determining a subject area using distance data and angle data acquired by the range finder and image data acquired by the fixed camera;

Camera parameter forming means for forming camera parameters of a tracking camera using the subject information and the distance data;

A tracking device characterized in that the range finder and the fixed camera are fixed with respect to a predetermined area.

[2] A tracking device for causing a tracking camera to track an object,

A fixed camera for acquiring image data of the predetermined area;

Based on the distance data and angle data acquired by the range finder, the processing area of the subject is narrowed down from the image data acquired by the fixed camera, and the candidate area narrowed by the narrowing-down is subjected to image analysis. Subject information forming means for forming subject information for determining

[3] The subject information forming means determines a subject area by subtracting image data from background data from the image data, and acquires subject information of the position and size of the subject. The tracking device described in.

[4] A candidate for performing image difference from distance data and angle data acquired by the range finder Candidate area forming means for forming a complementary area;

And a differencer for subtracting background data from the image data,

The differencer performs background difference processing to subtract background data from image data only in the candidate area to obtain difference image data.

The tracking device according to claim 1 or 2, wherein the subject information forming means obtains the size of the subject from the difference image data, and obtains the position of the subject with the distance data force.

[5] The camera parameters are a focus value, a pan value, a tilt value, and a zoom value, and the camera parameter forming unit forms a focus value from the distance data, and calculates a position force pan value and a tilt value of the object. The tracking device according to any one of claims 1 to 4, wherein the zoom value is formed based on the size of the subject.

[6] The tracking device according to claim 5, wherein the camera parameter forming means forms camera parameters of each tracking camera based on calibration data of a plurality of tracking cameras.

[7] The subtracter according to claim 4, wherein the subtracter has a circuit configuration that performs, for each pixel, differential processing between image data sequentially input from the fixed camera and background data acquired in advance. Tracking device.

[8] A tracking device for causing a tracking camera to track an object,

A fixed camera for acquiring image data of the predetermined area;

A tracking device comprising: position information of a reference position force of the range finder and the fixed camera; and acquiring distance data, angle data, and image data based on the reference position.

[9] A tracking device for causing a tracking camera to track an object,

A fixed camera for acquiring image data of the predetermined area; A tracking device, wherein the range finder and the fixed camera are vertically overlapped and disposed at one common position, and distance data, angle data, and image data based on the common position are acquired.

[10] The fixed camera includes a plurality of tracking cameras arranged with different horizontal orientations, and the range finder is characterized in that at least one is arranged above and Z or below the tracking cameras. A tracking device according to claim 2 or 9.

11. The tracking device according to claim 2, wherein two range finders are arranged in the vertical direction.

[12] The measurement range in the horizontal direction of the range finder includes an imaging range in the horizontal direction of the fixed camera or an imaging range in the horizontal direction obtained by combining the image data of the plurality of tracking cameras. The tracking device according to any one of claims 1, 2, 8, 9, 10 or 11, which is characterized.

[13] The tracking device according to any one of claims 1 to 12,

At least one tracking camera,

Image data processing means for synchronously acquiring and accumulating imaging data of the tracking camera;

A motion capture device characterized in that the tracking device drives and controls each tracking camera according to a formed camera parameter to acquire tracking image data of a subject.

[14] A tracking method for causing a tracking camera to track an object,

Measuring a distance and an angle from a fixed position of the object in a predetermined area; acquiring image data of the predetermined area;

Forming subject information for determining a subject area using the distance data and angle data and the image data;

Forming camera parameters of a tracking camera using the subject information and the distance data;

The tracking method, wherein the distance data, the angle data, and the image data are acquired at a fixed position with respect to a predetermined area.

[15] A tracking method for causing a tracking camera to track an object,

Based on the distance data and angle data acquired by the range finder, the processing area of the subject is narrowed down from the image data acquired by the fixed camera, and the candidate area narrowed by the narrowing-down is subjected to image analysis. Forming subject information to determine

[16] The step of forming the subject information includes:

The step of determining the image data area and the subject area by subtracting the image data and background data;

The tracking method according to claim 14 or 15, further comprising the step of acquiring subject information of the position and size of the subject.

[17] The step of forming the subject information includes:

Forming a candidate area for performing image difference from the distance data and the angle data;

Performing background subtraction processing to subtract image data strength background data only in the candidate area to acquire difference image data and determine a subject area;

Acquiring the size of the subject from the difference image data;

The tracking method according to claim 14 or 15, further comprising the step of acquiring the position of a subject from the distance data.