WO2011136405A1 - Image recognition device and method using 3D camera - Google Patents

Image recognition device and method using 3D camera

Info

Publication number
WO2011136405A1
Authority
WO
WIPO (PCT)
Prior art keywords
area
image
digital images
unit
extracted
Prior art date
Application number
PCT/KR2010/002669
Other languages
French (fr)
Korean (ko)
Inventor
강인배
Original Assignee
(주)아이티엑스시큐리티
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by (주)아이티엑스시큐리티 filed Critical (주)아이티엑스시큐리티
Publication of WO2011136405A1 publication Critical patent/WO2011136405A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/254Analysis of motion involving subtraction of images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30236Traffic on road, railway or crossing

Abstract

Disclosed are an image recognition device and method using a 3D camera. The image recognition device of the present invention calculates the area of a moving object using 3D depth map data obtained with two cameras, and recognizes whether the moving object is a particular object according to whether the calculated area falls within a preset range. To this end, the image recognition device generates the 3D depth map data, extracts the moving object and the outline of the extracted object, and then calculates the extracted object's area.

Description

Image recognition device and method using a 3D camera
The present invention relates to an image recognition apparatus and method using a 3D camera that can recognize objects based on 3D depth map data acquired with two cameras.
Known methods for obtaining a depth map from an image, that is, the distance to a subject in three-dimensional space, include using a stereo camera, using a laser scan, and using time of flight (TOF).
Among these, stereo matching with a stereo camera is a hardware implementation of the way a person perceives depth with two eyes: it extracts depth (or distance) information about a space by interpreting a pair of images obtained by photographing the same subject with two cameras. To this end, the binocular disparity along the same epipolar line of the images obtained from the two cameras is calculated. The disparity carries distance information, and the geometric quantity computed from it is the depth. By computing disparity values from the input images in real time, three-dimensional distance information about the observed space can be measured.
Known stereo matching algorithms include, for example, the "image matching method using multiple image lines" of Korean Patent No. 0517876 and the "binocular disparity estimation method for three-dimensional object recognition" of Korean Patent No. 0601958.
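As a concrete illustration of stereo matching in general (not of the specific algorithms in the patents cited above), a rectified left/right image pair can be turned into a disparity map with a standard block matcher. The sketch below uses OpenCV's semi-global matcher as a stand-in; the file names and matcher parameters are assumptions:

```python
# Illustrative stereo-matching sketch using OpenCV's semi-global block matcher.
# This is a stand-in for the cited patents' own algorithms, showing how a
# disparity (binocular difference) map is obtained along epipolar lines.
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)     # hypothetical file names
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# numDisparities must be a multiple of 16; blockSize is the matching window.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=9)

# compute() returns fixed-point disparities scaled by 16.
disparity = matcher.compute(left, right).astype("float32") / 16.0
```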
An object of the present invention is to provide an image recognition apparatus and method using a 3D camera that can recognize a subject based on 3D depth map data acquired with two cameras.
To achieve this object, the image recognition method using a 3D camera of the present invention includes: converting a pair of analog images, captured by two cameras photographing the same area, into digital images; calculating 3D depth map data using the converted pair of digital images; extracting the region of a moving object by comparing one of the digital images with a reference background image; calculating the area of the object based on the distance to the object obtained from the depth map data; and recognizing the object by determining whether the calculated area falls within a range preset for the area of a particular object.
Here, extracting the region of the object may include performing a subtraction operation on one of the digital images and the reference background image, and detecting the outline of the object extracted from the subtracted image.
Further, calculating the area of the object may obtain the total area of the object by finding the unit area of a pixel located at the distance of the extracted object and multiplying it by the number of pixels enclosed by the outline.
An image recognition apparatus according to another embodiment of the present invention includes a stereo camera unit, a distance information calculator, an object extractor, and an object recognizer. The stereo camera unit has two cameras photographing the same area and converts the captured pair of analog images into digital images. The distance information calculator computes 3D depth map data from the pair of digital images generated by the stereo camera unit, and the object extractor extracts the region of a moving object by comparing one of those digital images with a reference background image.
The object recognizer calculates the area of the object extracted by the object extractor, based on the distance to the object obtained from the depth map data computed by the distance information calculator, and recognizes the object by determining whether the calculated area falls within a range preset for the area of a particular object.
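The division of labor among these units can be sketched as a minimal skeleton; all class and method names below are illustrative and do not appear in the patent:

```python
# Minimal structural sketch of the apparatus: stereo camera unit feeding a
# distance calculator, object extractor, and area-based recognizer.
class StereoCameraUnit:
    def capture_pair(self):
        """Return a frame-synchronized (left, right) pair of digital images."""
        raise NotImplementedError

class ImageProcessor:
    def __init__(self, distance_calculator, object_extractor, object_recognizer):
        self.distance_calculator = distance_calculator
        self.object_extractor = object_extractor
        self.object_recognizer = object_recognizer

    def process(self, left, right):
        depth_map = self.distance_calculator.compute(left, right)   # 3D depth map data
        region = self.object_extractor.extract(left)                # moving-object region
        return self.object_recognizer.recognize(region, depth_map)  # area-based decision
```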
The image recognition apparatus of the present invention can recognize a moving object in the photographed area in a simpler way. Although it processes images generated with two cameras, its recognition algorithm is relatively simple compared with two-dimensional image processing, so recognition speed and efficiency improve and, above all, the recognition rate is excellent.
FIG. 1 is a block diagram of a 3D image recognition apparatus according to an embodiment of the present invention;
FIG. 2 is a flowchart provided to explain the 3D image recognition process of the present invention;
FIG. 3 shows the image processing results of the object extraction step; and
FIG. 4 is a diagram provided to explain the method of calculating an object's area.
The present invention is described in more detail below with reference to the drawings.
Referring to FIG. 1, the image recognition apparatus 100 of the present invention includes a stereo camera unit 110 and an image processor 130, and recognizes subjects in three-dimensional space.
The stereo camera unit 110 includes a first camera 111, a second camera 113, and an image receiver 115.
The first camera 111 and the second camera 113 are a pair of cameras installed apart from each other so as to photograph the same area, commonly called a stereo camera. They output analog video signals of the photographed area to the image receiver 115.
The image receiver 115 converts the continuous frames of video (or images) input from the first camera 111 and the second camera 113 into digital images, synchronizes their frames, and provides them to the image processor 130.
Depending on the embodiment, the first camera 111 and the second camera 113 of the stereo camera unit 110 may output digital rather than analog video signals; in that case the image receiver 115 provides the interface to the image processor 130 without any conversion and only matches the frame synchronization of the image pair.
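A hedged sketch of this frame-pairing role for free-running digital cameras follows; the grab-then-retrieve pattern only approximates frame synchronization, and the device indices are assumptions:

```python
# Sketch: grab frames from two cameras and pair them. grab() latches both
# sensors back-to-back before decoding with retrieve(), which approximates
# frame synchronization for free-running USB cameras.
import cv2

cam_left, cam_right = cv2.VideoCapture(0), cv2.VideoCapture(1)  # assumed indices

def capture_synchronized_pair():
    cam_left.grab()                      # latch both frames first...
    cam_right.grab()
    ok_l, left = cam_left.retrieve()     # ...then decode them
    ok_r, right = cam_right.retrieve()
    if not (ok_l and ok_r):
        raise RuntimeError("camera read failed")
    return left, right
```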
The stereo camera unit 110 may further include a wired or wireless interface for connecting to the image processor 130 over an IP (Internet Protocol) network.
The image processor 130 extracts the region of a moving object from the pair of digital image frames output by the stereo camera unit 110, determines whether that object is an object of interest, and can perform this determination in real time on every frame of the video continuously input from the stereo camera unit 110.
For this processing, the image processor 130 includes a distance information calculator 131, an object extractor 133, and an object recognizer 135. Their operation is described below with reference to FIG. 2.
First, when the first camera 111 and the second camera 113 generate analog video signals, the image receiver 115 converts them into digital video signals, synchronizes the frames, and provides them to the image processor 130 (steps S201 and S203).
<Depth map data calculation: step S205>
The distance information calculator 131 computes 3D depth map data, containing distance information for each pixel, from the pair of digital images received in real time from the image receiver 115.
Here, the distance information of each pixel is the binocular disparity obtained by the stereo matching methods described in the background art; it can be computed using, for example, the "image matching method using multiple image lines" of Korean Patent No. 0517876 or the graph cut algorithm presented in the "binocular disparity estimation method for three-dimensional object recognition" of Korean Patent No. 0601958. The depth map data computed by the distance information calculator 131 therefore contains distance information for every pixel.
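Once a disparity map is available, per-pixel distance follows from the standard relation depth = f·B/d for a calibrated, rectified stereo pair; the focal length and baseline below are assumed example values:

```python
# Sketch: convert a disparity map to per-pixel distance. depth = f * B / d,
# with f the focal length in pixels, B the camera baseline in meters, and
# d the disparity. f and B here are assumptions.
import numpy as np

def depth_from_disparity(disparity, focal_px=700.0, baseline_m=0.12):
    d = np.asarray(disparity, dtype=np.float32)
    depth = np.full(d.shape, np.inf, dtype=np.float32)  # no match -> treat as far away
    valid = d > 0
    depth[valid] = focal_px * baseline_m / d[valid]
    return depth                                        # distance in meters per pixel
```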
<Extraction of the moving object's region: step S207>
The object extractor 133 extracts the region of a moving object from one image of the pair of digital images input through the image receiver 115. Here, a moving object means an object within the camera's field of view whose position or motion has changed, or an object that has newly entered the field of view.
The region of a moving object can be extracted in various ways. For example, the object extractor 133 of the present invention extracts it by background subtraction, subtracting a previously stored background image from the input image frame. The subtraction operates pixel by pixel on the two corresponding image frames. The reference background image is an image assumed to contain no moving object; the object extractor 133 may store it on a storage medium (not shown) for later use.
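A minimal sketch of this background-subtraction step, with an assumed threshold value:

```python
# Subtract a stored reference background from the input frame pixel by pixel,
# then threshold the difference to obtain the moving-object region.
import cv2

def extract_moving_region(frame_gray, background_gray, thresh=30):
    diff = cv2.absdiff(frame_gray, background_gray)            # per-pixel subtraction
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    return mask                                                # white where the object moved
```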
Furthermore, since difference images can also arise in the background, outside the object region, from camera noise or changes in scene illumination, the object extractor 133 can cope with noise and lighting changes through background modeling, for example by applying Gaussian-distribution processing to the result of the subtraction.
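One common way to realize such Gaussian-distribution background modeling is a per-pixel Gaussian mixture; the sketch below uses OpenCV's MOG2 subtractor as an illustrative stand-in, with assumed parameters:

```python
# Per-pixel Gaussian-mixture background model: adapts to gradual lighting
# changes and suppresses camera noise while flagging moving foreground.
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

def extract_with_background_model(frame):
    return subtractor.apply(frame)   # updates the model, returns the foreground mask
```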
Referring to FIG. 3, (a) is the image input from the image receiver 115, (b) is the reference background image, and (c) is the result of the subtraction. FIG. 3(c) shows that the region of the moving object has been extracted from the input image.
<Outline detection of the moving object: step S209>
The object extractor 133 detects the outline of the moving object by performing edge detection on the subtraction result of step S207. Edge detection is handled with different edge types depending on the width and shape of the object's boundary.
For outline detection, the object extractor 133 can apply morphology operations to the subtraction image to remove noise and simplify the outline or skeleton lines. The basic morphology operations are erosion, which removes noise, and dilation, which fills small holes inside the object.
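A sketch of this cleanup-and-outline step; the kernel size, iteration counts, and the choice of the largest contour are assumptions:

```python
# Erosion drops isolated noise pixels, dilation fills small holes, and
# findContours() recovers the object outline from the cleaned mask.
import cv2
import numpy as np

def detect_outline(mask):
    kernel = np.ones((3, 3), np.uint8)
    mask = cv2.erode(mask, kernel, iterations=1)     # remove speckle noise
    mask = cv2.dilate(mask, kernel, iterations=2)    # fill small holes in the object
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return max(contours, key=cv2.contourArea) if contours else None
```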
<Area calculation of the moving object: step S211>
The object recognizer 135 calculates the area of the object extracted in step S209, using the outline information extracted by the object extractor 133 and the 3D depth map data computed by the distance information calculator 131.
The area is computed by finding the actual area per pixel (hereinafter the pixel's "unit area") at the distance l at which the object extracted in step S209 is located, and then multiplying by the number of pixels enclosed by the object's outline.
Referring to FIG. 4, M denotes the actual area covered by the whole frame at the maximum depth L, measured against the existing background image, and m(l) the actual area covered by the whole frame at the extracted object's position l. First, m(l), the actual area corresponding to the whole frame at the object's distance l, can be obtained as in Equation 1 below.
Equation 1

m(l) = M × (l / L)²

(The equation itself appears in the source only as an image placeholder; this square-law form is reconstructed from the pinhole-camera geometry of FIG. 4, in which the frame's footprint grows linearly in each dimension with distance.)
Here, M is the actual area corresponding to the whole frame (e.g., 720 × 640 pixels) at the maximum distance L, measured against the existing background image.
Next, dividing m(l), the actual area of the whole frame at the object's distance l, by the total number of pixels in the frame (P, e.g., 460,800 = 720 × 640) gives the unit area m_p(l) of a pixel within the object region, as in Equation 2 below.
Equation 2

m_p(l) = m(l) / P
Here, P is the total number of pixels. Equation 2 shows that m_p(l) depends on the distance l to the object, obtained from the distance information in the 3D depth map data.
Finally, as described above, the area of the object is obtained by multiplying the pixel unit area m_p(l) by the number of pixels p_c enclosed by the outline, as in Equation 3 below.
Equation 3

object area = m_p(l) × p_c
Here, p_c is the number of pixels contained in the object.
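Equations 1 through 3 reduce to a few lines of arithmetic; the sketch below uses the 720 × 640 frame from the text, while M, L, l, and the object pixel count are invented example values:

```python
# Numeric sketch of Equations 1-3 under the square-law form assumed above.
def object_area(M, L, l, frame_pixels, object_pixels):
    m_l = M * (l / L) ** 2          # Eq. 1: frame footprint at the object's distance
    m_p = m_l / frame_pixels        # Eq. 2: unit area of one pixel at distance l
    return m_p * object_pixels      # Eq. 3: object area = unit area x pixel count

# Example: frame covers 100 m^2 at L = 20 m; object at l = 5 m fills 12,000 pixels.
area = object_area(M=100.0, L=20.0, l=5.0, frame_pixels=720 * 640, object_pixels=12000)
```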
<Object recognition based on the moving object's area: step S213>
The object recognizer 135 recognizes the object by comparing the area obtained in step S211 with preset values. For example, if the object's area is at most a first size, it may be recognized as a four-legged animal; if its size falls within a first range, it may be recognized as an automobile.
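A sketch of this final thresholding decision; the labels and area ranges are invented placeholders, not values from the patent:

```python
# Compare the computed physical area against preset ranges per object class.
AREA_RANGES = {
    "animal": (0.05, 0.5),   # four-legged animal, in m^2 (assumed values)
    "person": (0.5, 1.2),
    "car":    (1.2, 8.0),
}

def recognize(area_m2):
    for label, (lo, hi) in AREA_RANGES.items():
        if lo <= area_m2 < hi:
            return label
    return "unknown"
```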
Through the above process, the image recognition apparatus of the present invention obtains 3D depth map data with a stereo camera and recognizes the objects captured in the image.
Although described here as if it precedes them in time, the depth map calculation of step S205 may run in parallel with the moving-object extraction of steps S207 and S209, as shown in FIG. 2, or even after steps S207 and S209.
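A sketch of that parallel arrangement using a thread pool; the worker functions are placeholders standing in for the steps sketched earlier:

```python
# Run the depth-map computation (S205) concurrently with object extraction
# (S207, S209); their results feed the area-based recognition (S211, S213).
from concurrent.futures import ThreadPoolExecutor

def compute_depth_map(left, right):        # placeholder for step S205
    return "depth map"

def extract_outline(frame, background):    # placeholder for steps S207-S209
    return "outline"

def process_frame(left, right, background):
    with ThreadPoolExecutor(max_workers=2) as pool:
        depth_f = pool.submit(compute_depth_map, left, right)
        outline_f = pool.submit(extract_outline, left, background)
        return depth_f.result(), outline_f.result()
```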
While preferred embodiments of the present invention have been shown and described above, the invention is not limited to these specific embodiments. Various modifications may be made by those of ordinary skill in the art without departing from the gist of the invention as claimed, and such modifications should not be understood separately from the technical idea or outlook of the invention.

Claims (5)

  1. An image recognition method using a 3D camera, comprising:
    generating a pair of digital images using two cameras photographing the same area;
    calculating 3D depth map data using the converted pair of digital images;
    extracting the region of a moving object by comparing one of the digital images with a reference background image;
    calculating the area of the object based on the distance to the object obtained from the depth map data; and
    recognizing the object by determining whether the calculated area falls within a range preset for the area of a particular object.
  2. The method of claim 1, wherein extracting the region of the object comprises:
    performing a subtraction operation on one of the digital images and the reference background image; and
    detecting the outline of the object extracted from the subtracted image.
  3. The method of claim 2, wherein calculating the area of the object obtains the total area of the object by finding the unit area of a pixel located at the distance of the extracted object and multiplying it by the number of pixels enclosed by the outline.
  4. An image recognition apparatus using a 3D camera, comprising:
    a stereo camera unit having two cameras photographing the same area and generating a pair of digital images;
    a distance information calculator calculating 3D depth map data using the pair of digital images generated by the stereo camera unit;
    an object extractor extracting the region of a moving object by comparing one of the digital images generated by the stereo camera unit with a reference background image; and
    an object recognizer calculating the area of the object extracted by the object extractor, based on the distance to the object obtained from the depth map data calculated by the distance information calculator, and recognizing the object by determining whether the calculated area falls within a range preset for the area of a particular object.
  5. The apparatus of claim 4, wherein the object extractor extracts the object by performing a subtraction operation on one of the digital images generated by the stereo camera unit and the reference background image, and then detects the outline of the extracted object for the area calculation.
PCT/KR2010/002669 2010-04-28 2010-04-28 Image recognition device and method using 3d camera WO2011136405A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020100039302A KR101148029B1 (en) 2010-04-28 2010-04-28 Video Analysing Apparatus and Method Using 3D Camera
KR10-2010-0039302 2010-04-28

Publications (1)

Publication Number Publication Date
WO2011136405A1 true WO2011136405A1 (en) 2011-11-03

Family

ID=44861681

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2010/002669 WO2011136405A1 (en) 2010-04-28 2010-04-28 Image recognition device and method using 3d camera

Country Status (2)

Country Link
KR (1) KR101148029B1 (en)
WO (1) WO2011136405A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107563461A (en) * 2017-08-25 2018-01-09 北京中骏博研科技有限公司 The automatic fees-collecting method and system of catering industry based on image recognition

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101203121B1 (en) * 2012-04-20 2012-11-21 주식회사 아이티엑스시큐리티 3 Dimensional Motion Recognition System and Method Using Stereo Camera
US9454816B2 (en) 2013-10-23 2016-09-27 International Electronic Machines Corp. Enhanced stereo imaging-based metrology
KR102259509B1 (en) 2019-08-22 2021-06-01 동의대학교 산학협력단 3d modeling process based on photo scanning technology
KR102339339B1 (en) * 2020-12-30 2021-12-15 (주)해양정보기술 Method for calculate volume of wave overtopping

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09145368A (en) * 1995-11-29 1997-06-06 Ikegami Tsushinki Co Ltd Moving and tracing method for object by stereoscopic image
KR20050066400A (en) * 2003-12-26 2005-06-30 한국전자통신연구원 Apparatus and method for the 3d object tracking using multi-view and depth cameras
KR20070065480A (en) * 2005-12-20 2007-06-25 한국철도기술연구원 A monitoring system for subway platform using stereoscopic video camera
KR20090027410A (en) * 2007-09-12 2009-03-17 한국철도기술연구원 Stereo vision based monitoring system in railway station and method thereof

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09145368A (en) * 1995-11-29 1997-06-06 Ikegami Tsushinki Co Ltd Moving and tracing method for object by stereoscopic image
KR20050066400A (en) * 2003-12-26 2005-06-30 한국전자통신연구원 Apparatus and method for the 3d object tracking using multi-view and depth cameras
KR20070065480A (en) * 2005-12-20 2007-06-25 한국철도기술연구원 A monitoring system for subway platform using stereoscopic video camera
KR20090027410A (en) * 2007-09-12 2009-03-17 한국철도기술연구원 Stereo vision based monitoring system in railway station and method thereof

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107563461A (en) * 2017-08-25 2018-01-09 北京中骏博研科技有限公司 The automatic fees-collecting method and system of catering industry based on image recognition

Also Published As

Publication number Publication date
KR20110119893A (en) 2011-11-03
KR101148029B1 (en) 2012-05-24

Similar Documents

Publication Publication Date Title
WO2011136407A1 (en) Apparatus and method for image recognition using a stereo camera
WO2016122069A1 (en) Method for measuring tire wear and device therefor
WO2011136405A1 (en) Image recognition device and method using 3d camera
WO2012124852A1 (en) Stereo camera device capable of tracking path of object in monitored area, and monitoring system and method using same
WO2013151270A1 (en) Apparatus and method for reconstructing high density three-dimensional image
CN110243390B (en) Pose determination method and device and odometer
WO2014035103A1 (en) Apparatus and method for monitoring object from captured image
CN112045676A (en) Method for grabbing transparent object by robot based on deep learning
WO2008111550A1 (en) Image analysis system and image analysis program
WO2015069063A1 (en) Method and system for creating a camera refocus effect
KR20110129158A (en) Method and system for detecting a candidate area of an object in an image processing system
WO2012133962A1 (en) Apparatus and method for recognizing 3d movement using stereo camera
WO2019098421A1 (en) Object reconstruction device using motion information and object reconstruction method using same
KR101281003B1 (en) Image processing system and method using multi view image
WO2018021657A1 (en) Method and apparatus for measuring confidence of deep value through stereo matching
WO2014185691A1 (en) Apparatus and method for extracting high watermark image from continuously photographed images
WO2017086522A1 (en) Method for synthesizing chroma key image without requiring background screen
KR20170001448A (en) Apparatus for measuring position of camera using stereo camera and method using the same
WO2014204126A2 (en) Apparatus for capturing 3d ultrasound images and method for operating same
WO2013077508A1 (en) Device and method for depth map generation and device and method using same for 3d image conversion
WO2021256640A1 (en) Device and method for reconstructing human posture and shape model on basis of multi-view image by using information on relative distance between joints
CN109447087A (en) A kind of oil smoke image dynamic area extracting method, identifying system and kitchen ventilator
CN111382607A (en) Living body detection method and device and face authentication system
CN111696143A (en) Event data registration method and system
CN111311615A (en) ToF-based scene segmentation method and system, storage medium and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10850773

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10850773

Country of ref document: EP

Kind code of ref document: A1